
Synthesis center for cell biology

(Creating a synthesis center to enable grassroots contributions for conceptual and quantitative models in cell biology.)
[[File:Conceptual and quantitative models.png|center|500x500px]]


While most of the "big data" cell biology community is focused on creating new datasets, we propose to synthesize existing data into quantitative and conceptual models. For one set of quantitative models, we are using large [https://www.proteinatlas.org/humanproteome/subcellular datasets] of the locations and interactions of cellular components to train generative (à la DALL-E 2) models of cells. The goal is for these synthetic cells to behave realistically in novel environments. For those models to be predictive, we need to constrain them with 1) quantitative parameter values from the literature and 2) mechanistic and biophysical information about the underlying processes. We need some help with #1 (an NLP challenge). For #2, we are building a platform to make mechanistic biophysical models of cellular processes that are interoperable, modular, and accessible. But how do we as a field synthesize existing cell biology data into higher-level concepts, models, and theories?


[[File:Quantitative model generation.png|center|350x350px]]


To make conceptual models, we would like to use the power and modularity of the [https://network-goods.notion.site/The-Discourse-Graph-starter-pack-312374c813b24ec6b4d53a054371ee5a discourse graph] schema - Questions, Claims, and Evidence - to structure the state of knowledge for our favorite research question(s). Furthermore, we'll extend the discourse graph schema to guide our ''ongoing'' research contributions to these questions. We call these extended graphs [https://youtu.be/P0KUt2yrUkw results graphs]. Our lab has begun to create discourse and results graphs to track our understanding of, and contributions to, our current research questions. Using Roam Research and Joel Chan's discourse graph extension, we classify a given research Question, collect Evidence from the literature and our lab notebooks, and use that Evidence to support Conclusions, which claim to address the research Question. It is early days, but this modular schema appears to help students structure their thinking, track their progress, and - most importantly - frame their work less as an individual endeavor and more as a contribution to a collective project (i.e., we are all trying to uncover the answer together).
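
The workflow above (pose a Question, collect Evidence, use it to support Conclusions that address the Question) can be pictured as a small labeled graph. The following is only a minimal sketch in Python under stated assumptions: the class names, the <code>source</code> field, and the relation labels "supports" and "addresses" are hypothetical and chosen for illustration. It is not the data model or API of Roam Research or of the discourse graph extension.

```python
from dataclasses import dataclass, field

# Node kinds as named in the workflow described above (assumed labels).
KINDS = {"Question", "Evidence", "Conclusion"}


@dataclass
class Node:
    kind: str         # one of KINDS
    text: str         # the question, observation, or conclusion itself
    source: str = ""  # hypothetical field, e.g. "literature" or "lab notebook"


@dataclass
class DiscourseGraph:
    nodes: list = field(default_factory=list)
    edges: list = field(default_factory=list)  # (src index, relation, dst index)

    def add(self, kind, text, source=""):
        """Store a node and return its index."""
        if kind not in KINDS:
            raise ValueError(f"unknown node kind: {kind}")
        self.nodes.append(Node(kind, text, source))
        return len(self.nodes) - 1

    def link(self, src, relation, dst):
        """Record a directed, labeled edge between two existing nodes."""
        self.edges.append((src, relation, dst))

    def linked_to(self, dst, relation):
        """Return every node with an edge of the given relation into dst."""
        return [self.nodes[s] for s, r, d in self.edges if d == dst and r == relation]


# Placeholder content only; no real findings are encoded here.
graph = DiscourseGraph()
q = graph.add("Question", "placeholder research question")
e1 = graph.add("Evidence", "placeholder result from a paper", source="literature")
e2 = graph.add("Evidence", "placeholder result from an experiment", source="lab notebook")
c = graph.add("Conclusion", "placeholder proposed answer")
graph.link(e1, "supports", c)
graph.link(e2, "supports", c)
graph.link(c, "addresses", q)

# List the evidence behind the conclusion, with where each piece came from.
for node in graph.linked_to(c, "supports"):
    print(node.kind, node.source)
```

Keeping the edges as labeled triples means a new relation or a new contributor's Evidence can be appended without restructuring the existing nodes, which is one way to read the "modular" property mentioned above.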


[[File:Purpose and users of cell biology discourse graphs.png|center|350x350px]]




With the current tooling, some ease-of-use improvements, and a 'captive audience' of initial users who will also be beneficiaries of the synthesis center, we think that discourse and results graphs will allow ''grassroots'' contributions from students, scientists, and community researchers to build overarching concepts, models, and theories in cell biology.


[[File:Progress to theories.png|center|800x800px]]