Biological processes in cells are regulated at several interacting levels, including DNA, RNA and proteins, so integrating the multimodal information captured by different omics data is a prerequisite for comprehensively characterizing the physiological and pathological states of cells [2].
In recent years, single-cell multiomics technologies (SHARE-seq [3], sci-CAR [4], inCITE-seq [5], 10x Multiome) have allowed biologists to measure several modalities in the same cell at the same time, and profiling different modalities of the same system can deepen our understanding of important life processes such as disease and embryonic development [6–8].
However, compared with earlier single-omics techniques, these multiomics techniques are harder to apply and more expensive, and they yield lower-quality data.
Therefore, a computational method that uses single-cell multiomics data as a supervision signal to integrate the large amount of high-quality single-modality data already available would be of great help to the field (Figure 1) [9].
Figure 1 Cross-modal representation learning in single-cell omics research
In response to this problem, the team led by Gao Ge at Peking University and Changping Laboratory proposed Cross-Linked Unified Embedding (CLUE), a framework for cross-modal representation learning [1].
The paper was accepted at NeurIPS 2022, a top conference in artificial intelligence, and was selected for oral presentation; the paper and code have been open-sourced.
A common paradigm for integrating single-cell multimodal data is to project data from different feature spaces into a low-dimensional space through modality-specific encoders, and then to align these modality-specific low-dimensional representations using the paired supervision signals provided by multiomics techniques.
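
The paradigm described above can be illustrated with a minimal NumPy sketch. It is not taken from any of the cited tools: the toy data, the single linear map standing in for each encoder, and the names `encode` and `alignment_loss` are all assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy data: 100 paired cells measured in two modalities with
# different feature dimensions (e.g. 50 genes and 20 surface proteins).
n_cells, n_genes, n_proteins, n_latent = 100, 50, 20, 8
x_rna = rng.normal(size=(n_cells, n_genes))
x_protein = rng.normal(size=(n_cells, n_proteins))

# One encoder per modality. Each is a single linear map here, an
# illustrative stand-in for a neural-network encoder.
w_rna = rng.normal(scale=0.1, size=(n_genes, n_latent))
w_protein = rng.normal(scale=0.1, size=(n_proteins, n_latent))


def encode(x, w):
    """Project a cells-by-features matrix into the shared latent space."""
    return x @ w


def alignment_loss(z_a, z_b):
    """Mean squared difference between embeddings of the same paired cells."""
    return float(np.mean((z_a - z_b) ** 2))


z_rna = encode(x_rna, w_rna)
z_protein = encode(x_protein, w_protein)
loss_before = alignment_loss(z_rna, z_protein)

# One gradient-descent step on the protein encoder, treating the RNA
# embedding as a fixed target (row i of both matrices is the same cell).
grad = 2.0 * x_protein.T @ (z_protein - z_rna) / z_protein.size
w_protein = w_protein - 0.1 * grad
loss_after = alignment_loss(z_rna, encode(x_protein, w_protein))

print(z_rna.shape, z_protein.shape)
print(loss_before, loss_after)
```

Both modalities end up in the same 8-dimensional space, and the pairing between rows is the only supervision used to pull the two embeddings together.
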
However, these methods share a limitation: they do not account for the different resolutions of different modalities. For example, immune cells are characterized in fine detail by their surface proteins, while their overall gene expression differs relatively little.
During integration, the low-resolution gene expression space therefore constrains the high-resolution protein space, and modality-specific information is lost.
In other words, the modalities hinder rather than reinforce one another.
To solve this problem, CLUE introduces modality-specific representation subspaces: each modality has its own subspace that learns the information of that modality, which removes the mutual constraints caused by the differing resolutions of the modalities.
CLUE also uses a self-encoder for each modality to learn the information within that modality and cross-encoders to learn the information shared between modalities, and then integrates the resulting representations through mappings between modalities (Figure 2).
Figure 2 Schematic diagram of the CLUE model framework
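
One way to picture modality-specific subspaces with self- and cross-encoders is the NumPy sketch below. It is a simplified reading of the description above, not the CLUE implementation: the linear encoders and decoders, the dictionary layout, the subspace size, and the names `embed` and `reconstruction_loss` are all assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy setup: two modalities measured in the same 100 cells.
feature_dims = {"rna": 50, "protein": 20}
modalities = list(feature_dims)
n_cells, sub_dim = 100, 4  # each modality-specific subspace has 4 dimensions

data = {m: rng.normal(size=(n_cells, d)) for m, d in feature_dims.items()}

# encoders[src][dst] maps data of modality `src` into the subspace reserved
# for modality `dst`. src == dst plays the role of a self-encoder and
# src != dst plays the role of a cross-encoder.
encoders = {
    src: {
        dst: rng.normal(scale=0.1, size=(feature_dims[src], sub_dim))
        for dst in modalities
    }
    for src in modalities
}

# decoders[dst] reconstructs modality `dst` from its own subspace only.
decoders = {
    m: rng.normal(scale=0.1, size=(sub_dim, feature_dims[m])) for m in modalities
}


def embed(src):
    """Embed cells of modality `src` as the concatenation of all subspaces."""
    blocks = [data[src] @ encoders[src][dst] for dst in modalities]
    return np.concatenate(blocks, axis=1)


def reconstruction_loss(src, dst):
    """Mean squared error when predicting modality `dst` from modality `src`.

    For src != dst this relies on the cells being paired row by row.
    """
    z_dst = data[src] @ encoders[src][dst]
    return float(np.mean((z_dst @ decoders[dst] - data[dst]) ** 2))


embeddings = {m: embed(m) for m in modalities}
losses = {(s, d): reconstruction_loss(s, d) for s in modalities for d in modalities}

for m in modalities:
    print(m, embeddings[m].shape)
```

Because every modality is embedded into the same concatenation of subspaces, the embeddings are directly comparable, while each subspace is only asked to explain its own modality.
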
In addition, CLUE uses adversarial learning to eliminate differences between the representations of different modalities, and minimizes the mean squared error between paired multimodal representations using the supervision signal from multiomics data, further improving the accuracy of integration.
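
The two alignment terms mentioned above can be sketched as follows. This is an assumption-laden illustration rather than CLUE's actual objective: the logistic discriminator, the loss weights `lam_adv` and `lam_mse`, the toy embeddings, and the way the terms are combined are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

n_cells, n_latent = 64, 8
# Hypothetical embeddings of the same paired cells from two modalities.
z_rna = rng.normal(size=(n_cells, n_latent))
z_atac = z_rna + rng.normal(scale=0.5, size=(n_cells, n_latent))


def paired_mse(z_a, z_b):
    """Mean squared error between embeddings of paired cells."""
    return float(np.mean((z_a - z_b) ** 2))


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def discriminator_loss(z_a, z_b, w, b):
    """Binary cross-entropy of a logistic discriminator that tries to tell
    embeddings of modality a (label 0) from those of modality b (label 1)."""
    z = np.vstack([z_a, z_b])
    y = np.concatenate([np.zeros(len(z_a)), np.ones(len(z_b))])
    p = sigmoid(z @ w + b)
    eps = 1e-12  # avoids log(0)
    return float(-np.mean(y * np.log(p + eps) + (1.0 - y) * np.log(1.0 - p + eps)))


# Hypothetical discriminator parameters and loss weights.
w = rng.normal(scale=0.1, size=n_latent)
b = 0.0
lam_adv, lam_mse = 1.0, 1.0

d_loss = discriminator_loss(z_rna, z_atac, w, b)
mse = paired_mse(z_rna, z_atac)

# The discriminator is trained to minimise d_loss. The encoders are trained
# to minimise this objective: keep paired cells close while making the
# modalities indistinguishable (i.e. pushing d_loss up).
encoder_objective = lam_mse * mse - lam_adv * d_loss

print(d_loss, mse, encoder_objective)
```

In a full training loop the discriminator and the encoders would be updated alternately; only the loss computation is shown here.
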
In the first NeurIPS Multimodal Single-Cell Data Integration competition, CLUE won first place in cross-modal integration in every integration category, covering single-cell chromatin accessibility, transcriptome and surface proteome data (Figure 3) [10].
CLUE also achieved the best performance in comparisons with integration methods that were not entered in the competition, such as MultiVI, Cobolt and bridge integration.
The CLUE model for single-cell multiomics has been integrated into GLUE (https://github.com/gao-lab/GLUE) [11], the Python-based open-source software package previously developed by Gao Ge's group.
Notably, the design of CLUE is not limited to single-cell multiomics data and can in principle be extended to other modalities such as images, text and audio.
Figure 3 Results of CLUE integration on single-cell chromatin accessibility, transcriptome and surface proteome data
Tu Xinming, an undergraduate student at Peking University's School of Life Sciences (currently a doctoral student at the University of Washington), and Dr. Cao Zhijie, a postdoctoral fellow at Peking University, are the co-first authors of the paper. Xia Chenrui, a graduate student at Peking University, is the corresponding author, and Professor Sara Mostafavi of the University of Washington, Tu Xinming's current supervisor, is the co-corresponding author.
The research was supported by the National Key Research and Development Program, the State Key Laboratory of Protein and Plant Gene Research, the Beijing Future Genetic Diagnosis Advanced Innovation Center and Changping Laboratory.
Open source code: https://github.com/gao-lab/GLUE
Full text: https://openreview.net/pdf?id=Tfb73TeKnJ-
1. Tu, X.*, Cao, Z.-J.*, Xia, C., Mostafavi, S. & Gao, G. Cross-Linked Unified Embedding for cross-modality representation learning. in 36th Conference on Neural Information Processing Systems (NeurIPS 2022).
2. Stuart, T., Butler, A., Hoffman, P., Hafemeister, C., Papalexi, E., Mauck, W. M., Hao, Y., Stoeckius, M., Smibert, P. & Satija, R. Comprehensive Integration of Single-Cell Data. Cell 177 (2019).
3. Ma, S., Zhang, B., LaFave, L. M., Earl, A. S., Chiang, Z., Hu, Y., Ding, J., Brack, A., Kartha, V. K., Tay, T., Law, T., Lareau, C., Hsu, Y.-C., Regev, A. & Buenrostro, J. D. Chromatin Potential Identified by Shared Single-Cell Profiling of RNA and Chromatin. Cell 183, 1103–1116.e20 (2020).
4. Cao, J., Cusanovich, D. A., Ramani, V., Aghamirzaie, D., Pliner, H. A., Hill, A. J., Daza, R. M., McFaline-Figueroa, J. L., Packer, J. S., Christiansen, L., Steemers, F. J., Adey, A. C., Trapnell, C. & Shendure, J. Joint profiling of chromatin accessibility and gene expression in thousands of single cells. Science 361, 1380–1385 (2018).
5. Chung, H., Parkhurst, C. N., Magee, E. M., Phillips, D., Habibi, E., Chen, F., Yeung, B. Z., Waldman, J., Artis, D. & Regev, A. Joint single-cell measurements of nuclear proteins and RNA in vivo. Nat Methods 18, 1204–1212 (2021).
6. Janssens, J., Aibar, S., Taskiran, I. I., Ismail, J. N., Gomez, A. E., Aughey, G., Spanier, K. I., Rop, F. V. D., González-Blas, C. B., Dionne, M., Grimes, K., Quan, X. J., Papasokrati, D., Hulselmans, G., Makhzami, S., Waegeneer, M. D., Christiaens, V., Southall, T. & Aerts, S. Decoding gene regulation in the fly brain. Nature 1–7 (2022). doi:10.1038/s41586-021-04262-z
7. Argelaguet, R., Clark, S. J., Mohammed, H., Stapel, L. C., Krueger, C., Kapourani, C.-A., Imaz-Rosshandler, I., Lohoff, T., Xiang, Y., Hanna, C. W., Smallwood, S., Ibarra-Soria, X., Buettner, F., Sanguinetti, G., Xie, W., Krueger, F., Göttgens, B., Rugg-Gunn, P. J., Kelsey, G., Dean, W., Nichols, J., Stegle, O., Marioni, J. C. & Reik, W. Multi-omics profiling of mouse gastrulation at single-cell resolution. Nature 576, 487–491 (2019).
8. Welch, J. D., Kozareva, V., Ferreira, A., Vanderburg, C., Martin, C. & Macosko, E. Z. Single-Cell Multi-omic Integration Compares and Contrasts Features of Brain Cell Identity. Cell 177 (2019).
9. Argelaguet, R., Cuomo, A. S. E., Stegle, O. & Marioni, J. C. Computational principles and challenges in single-cell data integration. Nat Biotechnol 39, 1202–1215 (2021).
10. Lance, C., Luecken, M. D., Burkhardt, D. B., Cannoodt, R., Rautenstrauch, P., Laddach, A., Ubingazhibov, A., Cao, Z.-J., Deng, K., Khan, S., Liu, Q., Russkikh, N., Ryazantsev, G., Ohler, U., NeurIPS 2021 Multimodal data integration competition participants, Pisco, A. O., Bloom, J., Krishnaswamy, S. & Theis, F. J. Multimodal single cell data integration challenge: results and lessons learned. bioRxiv 2022.04.11.487796 (2022). doi:10.1101/2022.04.11.487796
11. Cao, Z.-J. & Gao, G. Multi-omics single-cell data integration and regulatory inference with graph-linked embedding. Nat Biotechnol 1–9 (2022). doi:10.1038/s41587-022-01284-4