Computable Graphs: Difference between revisions

From Synthesis Infrastructures
(Flesh out 'compiling graphs to manuscripts' project)
Line 31: Line 31:
* Belinda: what is scope of the project in terms of users?
** Matt: hoping this accessible all the way to high school students also!
* AI2 has platform for running simulations, make it easy to run and share with others
* AICS is making platform for running simulations, make it easy to run and share with others - [[Simularium]]
* Dafna: are there pain points on the input side? (seems painful!)
* Dafna: how similar are the things that people pull out from the same paper?
Line 59: Line 59:


* idea of compiling from discourse graph to manuscript seems feasible?
[[File:Discourse graph to manuscript draft via NLP.png|900x900px]]
** Belinda could mock something up in OpenAI playground really quickly for a few papers
*** need examples or access to repo
Line 76: Line 75:


== Compiling graphs to manuscripts ==
=== Overview ===
We'd like to build a prototype of a tool that starts with discourse units (phrases with the category Question/motivation, Method, Evidence, Claim) and uses NLP to generate a draft of a manuscript paragraph.
[[File:Discourse graph to manuscript draft via NLP.png|900x900px]]
=== Purpose ===
Such a tool would speed up a really time-consuming aspect of the academic job: drafting grant proposals and manuscripts from content. It would also encourage researchers to generate structured content (questions/claims/evidence) that can be incorporated into a discourse graph.
The benefit to the user is a structured approach to writing papers and grant proposals. Using the tool will introduce researchers to the concept of discourse units, and generate a repository of discourse units that can be turned into discourse graphs.
=== Ideal outcomes ===
(more to add here)
This tool could also enable the micropublication of mini discourse graphs (one question/method/evidence/claim) by generating a draft of the explanatory text.
* Could also integrate into Roam Research graphs, e.g. in Matt's lab, via the Roam GPT-3 extension
=== What we're doing next ===
* Belinda could mock something up in OpenAI playground really quickly for a few papers
** need examples or access to repo
* Matt and Michael will share some examples of papers or content from the discourse graph to use as source data
* Sid can check out [https://github.com/LayBacc/roam-ai Roam AI extension] for incorporation into our Roam graph workflow (fork, make modifications, etc.)
** ([https://www.reddit.com/r/RoamResearch/comments/yigf6q/im_genuinely_fascinated_with_the_roam_ai/ discussion on its usage])
** [https://www.loom.com/share/d152e7a184f94080b8777f595821f43e usage video]
=== Related conversation ===
* Dafna: "compiling" discourse graph to manuscript seems much easier, esp. if have consistent structure and human-in-the-loop
** Similar to brainstorming discussion/ethics statements for a paper given abstract (via GPT-3)
* Belinda: how much variation in paper structure within your field?
** some variations by journal
=== Claims in the conversation that need evidence ===
* the majority of empirical research papers in biology have a similar structure (question/ motivation/ evidence (fig.1a)/ claim for each paragraph & figure panel)
* multiple researchers (or students) asked to highlight the questions/claims/evidence text from a paper will highlight similar/consensus text (part of the NLP-to-highlights project)


== Next Steps ==

Revision as of 18:27, 12 November 2022

Computable Graphs
Description: How to ground knowledge graphs (that can be used for prediction or computational simulation experiments and models) in the discourse and quantitative evidence in scientific literature?
Related Topics: Knowledge Graphs
Projects: Synthesis center for cell biology, Translate Logseq Knowledge Graph to Systems Biology Network Diagrams
Discord Channel: #computable-graphs
Facilitator: Joel Chan
Members: Aakanksha Naik, Dafna Shahaf, Belinda Mo, Akila Wijerathna-Yapa, Matthew Akamatsu, Michael Gartner, Joel Chan

Facilitator/Point of Contact: Joel Chan

What

How to ground knowledge graphs (that can be used for prediction or computational simulation experiments and models) in the discourse of evidence in scientific literature? How to transition from unstructured literature to knowledge graphs and keep things updated with appropriate provenance for (un)certainty?

Discussion entry points

Resources

First breakout group session

who was present: Joel, Matt, Michael, Sid, Aakanksha, Belinda

  • Dafna: very much like the old issue-based argument maps!
  • Belinda: what is the scope of the project in terms of users?
    • Matt: hoping this is accessible all the way to high school students too!
  • AICS is making a platform for running simulations that makes it easy to run and share them with others - Simularium
  • Dafna: are there pain points on the input side? (seems painful!)
  • Dafna: how similar are the things that people pull out from the same paper?
    • Matt: for our field, pretty similar, esp. for well-written papers
    • Belinda: compare what each person is highlighting and extract summary that is representative of all the annotations
  • Dafna: how useful are these (bits of) knowledge graphs for others?
    • Works well within the lab; better than unstructured text; motivating to try to create micropublications to summarize the outcome of a 10-week rotation
  • Dafna: "compiling" discourse graph to manuscript seems much easier, esp. if have consistent structure and human-in-the-loop
    • Similar to brainstorming discussion/ethics statements for a paper given abstract (via GPT-3)
  • Belinda: how much variation in paper structure within your field?
    • some variations by journal
    • some authors (who think more highly of themselves) write in a more declarative/general way, with a less clear distinction between claims and evidence
  • Aakanksha:
    • hypothes.is experiment
    • need infra changes
      • help make the case for these changes
        • (maybe how much $$ each person would pay for this!!)
        • demonstration of value
        • brainstorming what research projects would be part of this

emerging themes/problems:

  • idea around changing the reading process somehow (with high hopes for something like the Semantic Scholar PDF reader, with starting annotations tuned to what Matt is trying to extract, maybe also at the abstract level) --> these could also feed back / forward to other users
    • can probably start from this: Scim: Intelligent Faceted Highlights for Interactive, Multi-Pass Skimming of Scientific Papers https://arxiv.org/pdf/2205.04561.pdf
    • cross connections to what
  • idea of compiling from discourse graph to manuscript seems feasible?
    • Belinda could mock something up in OpenAI playground really quickly for a few papers
      • need examples or access to repo
    • Could also integrate into Matt's lab via GPT-3 extension
  • question: understanding different levels of value of having a knowledge graph for someone else who didn't create it
    • --> Joel can add links to ongoing lit review on this question - not resolved yet
  • theme/idea: "compiling" from discourse to knowledge graphs
    • Aakanksha: often see ontologies in isolation
      • Aakanksha: can look into NLP around adding context to knowledge graphs
    • Michael: interesting to think about the user experience on this - how do they interact?
    • Grounding abstractions: https://www.susielu.com/data-viz/abstractions

Contextualizing knowledge graphs

Understanding knowledge graph transfer

Compiling graphs to manuscripts

Overview

We'd like to build a prototype of a tool that starts with discourse units (phrases with the category Question/motivation, Method, Evidence, Claim) and uses NLP to generate a draft of a manuscript paragraph.

[Figure: Discourse graph to manuscript draft via NLP]
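
As a rough illustration of the input side of such a prototype, the sketch below shows one way discourse units could be represented and assembled into a prompt for a language model. Everything here is an assumption made for illustration: the names `DiscourseUnit`, `CATEGORY_ORDER` and `build_prompt`, the shortening of "Question/motivation" to "Question", and the wording of the instruction text are all hypothetical, and no model is called. The unit texts in the usage lines are placeholders, not content from a real paper.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import List

# Assumed presentation order, following the categories named in the Overview.
# "Question" stands in for "Question/motivation".
CATEGORY_ORDER = ["Question", "Method", "Evidence", "Claim"]


@dataclass(frozen=True)
class DiscourseUnit:
    category: str  # expected to be one of CATEGORY_ORDER
    text: str      # the phrase captured in the discourse graph


def build_prompt(units: List[DiscourseUnit]) -> str:
    """Assemble a plain-text prompt asking for one draft manuscript paragraph."""
    unknown = [u.category for u in units if u.category not in CATEGORY_ORDER]
    if unknown:
        raise ValueError(f"unknown categories: {unknown}")
    # sorted() is stable, so units keep their input order within a category.
    ordered = sorted(units, key=lambda u: CATEGORY_ORDER.index(u.category))
    lines = [f"{u.category}: {u.text}" for u in ordered]
    instruction = (
        "Write one manuscript paragraph that states the question, "
        "describes the method, reports the evidence, and ends with the claim. "
        "Do not add facts that are not listed above."
    )
    return "\n".join(lines) + "\n\n" + instruction


if __name__ == "__main__":
    # Placeholder texts only; real units would come from the discourse graph.
    example = [
        DiscourseUnit("Claim", "<claim text>"),
        DiscourseUnit("Question", "<question text>"),
        DiscourseUnit("Evidence", "<evidence text>"),
        DiscourseUnit("Method", "<method text>"),
    ]
    print(build_prompt(example))
```

The resulting string could be pasted into a playground by hand, which keeps a human in the loop for checking the draft against the source units.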

Purpose

Such a tool would speed up a really time-consuming aspect of the academic job: drafting grant proposals and manuscripts from content. It would also encourage researchers to generate structured content (questions/claims/evidence) that can be incorporated into a discourse graph.

The benefit to the user is a structured approach to writing papers and grant proposals. Using the tool will introduce researchers to the concept of discourse units, and generate a repository of discourse units that can be turned into discourse graphs.

Ideal outcomes

(more to add here)

This tool could also enable the micropublication of mini discourse graphs (one question/method/evidence/claim) by generating a draft of the explanatory text.

  • Could also integrate into Roam Research graphs, e.g. in Matt's lab, via the Roam GPT-3 extension

What we're doing next

  • Belinda could mock something up in OpenAI playground really quickly for a few papers
    • need examples or access to repo
  • Matt and Michael will share some examples of papers or content from the discourse graph to use as source data
  • Sid can check out Roam AI extension for incorporation into our Roam graph workflow (fork, make modifications, etc.)

Related conversation

  • Dafna: "compiling" discourse graph to manuscript seems much easier, esp. if have consistent structure and human-in-the-loop
    • Similar to brainstorming discussion/ethics statements for a paper given abstract (via GPT-3)
  • Belinda: how much variation in paper structure within your field?
    • some variations by journal

Claims in the conversation that need evidence

  • the majority of empirical research papers in biology have a similar structure (question/ motivation/ evidence (fig.1a)/ claim for each paragraph & figure panel)
  • multiple researchers (or students) asked to highlight the questions/claims/evidence text from a paper will highlight similar/consensus text (part of the NLP-to-highlights project)
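
One possible way to gather evidence for the second claim is to measure how much different annotators' highlights overlap. The sketch below is only an illustration under stated assumptions: highlights are taken to be (start, end) character offsets into the same paper text, agreement is taken to be the mean pairwise Jaccard overlap of highlighted characters, and the names `covered`, `jaccard`, `mean_pairwise_jaccard` and `consensus_chars` are hypothetical. Other agreement measures may suit the NLP-to-highlights project better.

```python
from __future__ import annotations

from collections import Counter
from itertools import combinations
from typing import Dict, List, Set, Tuple

Span = Tuple[int, int]  # (start, end) character offsets, end exclusive


def covered(spans: List[Span]) -> Set[int]:
    """Return the set of character positions covered by the spans."""
    chars: Set[int] = set()
    for start, end in spans:
        chars.update(range(start, end))
    return chars


def jaccard(a: Set[int], b: Set[int]) -> float:
    """Jaccard overlap of two sets; two empty sets count as full agreement."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def mean_pairwise_jaccard(annotations: Dict[str, List[Span]]) -> float:
    """Average Jaccard overlap over every pair of annotators."""
    sets = [covered(spans) for spans in annotations.values()]
    pairs = list(combinations(sets, 2))
    if not pairs:
        raise ValueError("need at least two annotators")
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)


def consensus_chars(annotations: Dict[str, List[Span]], min_votes: int) -> Set[int]:
    """Character positions highlighted by at least min_votes annotators."""
    votes: Counter = Counter()
    for spans in annotations.values():
        # Each annotator contributes at most one vote per character.
        votes.update(covered(spans))
    return {pos for pos, n in votes.items() if n >= min_votes}
```

A high mean overlap across many papers would support the claim, and `consensus_chars` gives a simple starting point for extracting text that is representative of all the annotations.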

Next Steps

  • don't know if a joint project makes sense, but perhaps coordinated first prototypes of a bridge?

could use:

  • someone with programming skills to implement a POC translation between a discourse graph and one of the specific modeling languages/ontologies