Seurat V5 | One function handles multiple batch-correction methods; try them as needed
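Seurat is one of the most widely used R packages for single-cell RNA data analysis. Upgrading to the current V5 release brings some inconveniences but also functional improvements, so decide whether to upgrade based on your own situation and analysis needs. The V5 changes fall mainly into four areas (https://satijalab.org/seurat/articles/get_started_v5_new); this post covers the first: the integration of batch-correction methods in Seurat V5.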
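Seurat v5 introduces a more flexible and streamlined infrastructure in which a single line of code can run different integration (batch-correction) algorithms. This greatly reduces the time spent on environment setup and data wrangling for each method, so you can concentrate on which method gives better results. It also makes it easier to explore the results of different integration methods and to compare them with a workflow that skips integration. This post uses the ifnb dataset as an example to walk through the batch-correction process and methods.

I. R packages and data preparation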
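1 Load the R packages
Install the required R packages. Note that install.packages('Seurat') now installs V5 by default.

    library(Seurat)
    library(SeuratData)
    #remotes::install_github("satijalab/seurat-wrappers")
    remotes::install_local("./seurat-wrappers-master.zip", upgrade = FALSE, dependencies = FALSE)
    library(SeuratWrappers)
    library(ggplot2)
    library(patchwork)
    options(future.globals.maxSize = 1e9)

Many of the R packages in this series are hosted on GitHub and may fail to install. Taking satijalab/seurat-wrappers as an example: if the package cannot be downloaded from GitHub, open its GitHub page, click Code, download the zip file, and then install it locally with remotes::install_local.

2 Download the example data
Test datasets are usually hosted on servers abroad, so depending on network access and speed the download will quite likely fail. If it does, download the package file first and install it locally (http://seurat.nygenome.org/src/contrib/ifnb.SeuratData_3.1.0.tar.gz). For more dataset names and download links see https://zhuanlan.zhihu.com/p/661800023 .

    # download the test dataset
    #InstallData("ifnb")
    install.packages('./ifnb.SeuratData_3.1.0.tar.gz', repos = NULL, type = "source")

After downloading, load the data and check the batches to be processed (the stim column).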
    # load the ifnb dataset
    obj <- LoadData("ifnb")
    obj <- UpdateSeuratObject(obj)
    obj <- subset(obj, nFeature_RNA > 1000)
    obj

    An object of class Seurat
    14053 features across 1254 samples within 1 assay
    Active assay: RNA (14053 features, 0 variable features)
     2 layers present: counts, data
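The text above mentions checking the stim column but does not show how. A minimal sketch, assuming the metadata is available as a data frame (for a Seurat object, obj[[]] returns it): the helper name count_cells_per_batch and the toy metadata are made up for illustration and are not part of Seurat.

```r
# Count cells per batch from a metadata data frame.
# `batch_col` defaults to "stim", the batch column used in this walkthrough.
count_cells_per_batch <- function(meta, batch_col = "stim") {
  stopifnot(is.data.frame(meta), batch_col %in% colnames(meta))
  table(meta[[batch_col]], dnn = batch_col)
}

# Toy metadata standing in for obj[[]] (made-up values, not the real ifnb data).
toy_meta <- data.frame(stim = c("CTRL", "CTRL", "STIM"))
batch_counts <- count_cells_per_batch(toy_meta)
print(batch_counts)

# With the Seurat object from this walkthrough loaded:
# count_cells_per_batch(obj[[]])
```

Working on the metadata data frame rather than on the object keeps the helper independent of the Seurat version.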
As the printed object shows, one major change in Seurat V5 is the layer.

II. Data integration (batch correction)
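1 Split the data
The example Seurat object contains data from 2 different treatments (the stim column of the metadata). For integration, Seurat v5 splits the data into separate layers rather than into multiple objects. After splitting there are 4 layers: each batch in the stim column has its own counts and data matrix. Seurat V4 required splitting the data into 2 separate Seurat objects.

    obj[["RNA"]] <- split(obj[["RNA"]], f = obj$stim)
    obj

    An object of class Seurat
    14053 features across 1254 samples within 1 assay
    Active assay: RNA (14053 features, 0 variable features)
     4 layers present: counts.CTRL, counts.STIM, data.CTRL, data.STIM

Note that because the data are split into layers, normalization and highly variable gene (HVG) detection run independently for each batch, and a consistent set of variable features is identified automatically.

    obj <- NormalizeData(obj)
    obj <- FindVariableFeatures(obj)
    obj <- ScaleData(obj)
    obj <- RunPCA(obj)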
Here NormalizeData and FindVariableFeatures are applied to each batch separately.

2 Merge directly (no batch correction)
First analyze the data without integration to see how strong the batch effect is.

    # no integration
    obj <- FindNeighbors(obj, dims = 1:30, reduction = "pca")
    obj <- FindClusters(obj, resolution = 2, cluster.name = "unintegrated_clusters")
    obj <- RunUMAP(obj, dims = 1:30, reduction = "pca", reduction.name = "umap.unintegrated")
    DimPlot(obj, reduction = "umap.unintegrated", group.by = c("stim", "seurat_annotations"))
3 Batch correction in one line of code
The IntegrateLayers function in Seurat v5 runs a batch-correction integration analysis in a single line of code. It currently supports the following five mainstream single-cell integration methods:
Anchor-based CCA integration (method=CCAIntegration)
Anchor-based RPCA integration (method=RPCAIntegration)
Harmony (method=HarmonyIntegration)
FastMNN (method=FastMNNIntegration)
scVI (method=scVIIntegration)

    # CCA
    obj <- IntegrateLayers(
      object = obj, method = CCAIntegration,
      orig.reduction = "pca", new.reduction = "integrated.cca",
      verbose = FALSE
    )
    # RPCA
    obj <- IntegrateLayers(
      object = obj, method = RPCAIntegration,
      orig.reduction = "pca", new.reduction = "integrated.rpca",
      verbose = FALSE
    )
    # Harmony
    obj <- IntegrateLayers(
      object = obj, method = HarmonyIntegration,
      orig.reduction = "pca", new.reduction = "harmony",
      verbose = FALSE
    )
    # FastMNN
    obj <- IntegrateLayers(
      object = obj, method = FastMNNIntegration,
      new.reduction = "integrated.mnn",
      verbose = FALSE
    )
    obj
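scVI is listed above but not run in this walkthrough. A hedged sketch of what the call might look like: scVIIntegration needs reticulate and a conda environment with scvi-tools installed, and the conda_env path below is a hypothetical placeholder to replace with your own. The guard keeps the block from running unless both the walkthrough object and that environment exist.

```r
# Hypothetical path: point this at the conda environment where scvi-tools is installed.
scvi_env <- "../miniconda3/envs/scvi-env"

# Run only when the walkthrough object and the conda environment are both available.
if (exists("obj") && dir.exists(scvi_env)) {
  obj <- Seurat::IntegrateLayers(
    object = obj,
    method = SeuratWrappers::scVIIntegration,
    new.reduction = "integrated.scvi",  # its own name, so other results are not overwritten
    conda_env = scvi_env,
    verbose = FALSE
  )
}
```

This step is optional; the rest of the walkthrough uses only the four methods run above.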
The object now holds the results of four batch-correction methods. The next step is to visualize each in turn and then choose the final method for downstream analysis. Be sure to give each run its own new.reduction name; otherwise a later result overwrites an earlier one.

4 Choose the batch-correction method
4.1 UMAP
CCA and RPCA are used as examples here. The other two methods work the same way; remember to change reduction.name.

    ##### CCA #####
    obj <- FindNeighbors(obj, reduction = "integrated.cca", dims = 1:30)
    obj <- FindClusters(obj, resolution = 2, cluster.name = "cca_clusters")
    obj <- RunUMAP(obj, reduction = "integrated.cca", dims = 1:30, reduction.name = "umap.cca")
    # group by the ifnb metadata columns (stim, seurat_annotations) plus the new clusters
    p1 <- DimPlot(
      obj,
      reduction = "umap.cca",
      group.by = c("stim", "seurat_annotations", "cca_clusters"),
      combine = FALSE, label.size = 2
    )
    ##### RPCA #####
    obj <- FindNeighbors(obj, reduction = "integrated.rpca", dims = 1:30)
    obj <- FindClusters(obj, resolution = 2, cluster.name = "rpca_clusters")
    obj <- RunUMAP(obj, reduction = "integrated.rpca", dims = 1:30, reduction.name = "umap.rpca")
    p2 <- DimPlot(
      obj,
      reduction = "umap.rpca",
      group.by = c("stim", "seurat_annotations", "rpca_clusters"),
      combine = FALSE, label.size = 2
    )
    wrap_plots(c(p1, p2), ncol = 2, byrow = FALSE)
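For the other two methods, the same three steps can be wrapped in a small helper. This is a sketch, not part of Seurat: the names integration_names and cluster_and_embed are invented here, and the reduction names "harmony" and "integrated.mnn" are assumed to match the new.reduction values used in the integration step.

```r
# Build the metadata column name and UMAP name for one integration method.
integration_names <- function(label) {
  list(clusters = paste0(label, "_clusters"), umap = paste0("umap.", label))
}

# Cluster and embed one integrated reduction, mirroring the CCA/RPCA steps.
cluster_and_embed <- function(obj, reduction, label, dims = 1:30, resolution = 2) {
  nm <- integration_names(label)
  obj <- Seurat::FindNeighbors(obj, reduction = reduction, dims = dims)
  obj <- Seurat::FindClusters(obj, resolution = resolution, cluster.name = nm$clusters)
  Seurat::RunUMAP(obj, reduction = reduction, dims = dims, reduction.name = nm$umap)
}

# Apply to the remaining two methods when the walkthrough object is loaded.
if (exists("obj")) {
  obj <- cluster_and_embed(obj, reduction = "harmony", label = "harmony")
  obj <- cluster_and_embed(obj, reduction = "integrated.mnn", label = "mnn")
}
```

With these names, the resulting plots would use reduction = "umap.harmony" or "umap.mnn" and the cluster columns harmony_clusters or mnn_clusters.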
Compared with the direct merge, the batch effect between the stim conditions has been corrected. You can add the other two methods to show all 4 side by side, then pick one for downstream analysis.

4.2 Marker visualization
Classic marker genes can also be used to compare how the batch-correction methods perform.
(1) VlnPlot

    p1 <- VlnPlot(
      obj, features = "rna_CD8A", group.by = "unintegrated_clusters"
    ) + NoLegend() + ggtitle("CD8A - Unintegrated Clusters")
    p2 <- VlnPlot(
      obj, "rna_CD8A", group.by = "cca_clusters"
    ) + NoLegend() + ggtitle("CD8A - CCA Clusters")
    p3 <- VlnPlot(
      obj, "rna_CD8A", group.by = "rpca_clusters"
    ) + NoLegend() + ggtitle("CD8A - RPCA Clusters")
    p1 | p2 | p3
(2) DimPlot

    p4 <- DimPlot(obj, reduction = "umap.unintegrated", group.by = c("cca_clusters"))
    p5 <- DimPlot(obj, reduction = "umap.rpca", group.by = c("cca_clusters"))
    p6 <- DimPlot(obj, reduction = "umap.cca", group.by = c("cca_clusters"))
    p4 | p5 | p6
Use the information above to decide which batch-correction method to keep.

III. FindMarker analysis
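Once the batch-correction method is chosen, you can move on to FindMarker analysis and annotation.

1 Rejoin the layers
Note that the layers are still split by stim batch. Before any differential expression analysis, rejoin them with the JoinLayers function.

    obj
    obj2 <- JoinLayers(obj)  # a separate name only to show the difference; in practice assign back to obj
    obj2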
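As a rough sketch of the marker step on the rejoined object: the helper name find_cluster_markers is invented for illustration, and "cca_clusters" is assumed to be the clustering chosen in the previous section. only.pos = TRUE restricts the output to up-regulated markers, which is a common choice rather than a requirement.

```r
# Find positive markers for every cluster stored in the metadata column `cluster_col`.
find_cluster_markers <- function(obj, cluster_col = "cca_clusters", ...) {
  obj <- SeuratObject::JoinLayers(obj)      # differential expression needs the rejoined layers
  SeuratObject::Idents(obj) <- cluster_col  # use the chosen clustering as cell identities
  Seurat::FindAllMarkers(obj, only.pos = TRUE, ...)
}

# Run on the walkthrough object when it is loaded.
if (exists("obj")) {
  markers <- find_cluster_markers(obj, cluster_col = "cca_clusters")
  print(head(markers))
}
```

Extra arguments such as min.pct or logfc.threshold can be passed through the dots to FindAllMarkers.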
From here, DEG analysis identifies the marker genes of each cluster, which can then be annotated manually or automatically with tools such as SingleR.

References:
https://satijalab.org/seurat/articles/seurat5_integration
https://satijalab.org/seurat/articles/integration_introduction