Abstract Poetry: Difference between revisions

Revision as of 13:44, 31 October 2022

3,771 bytes added, 13:44, 31 October 2022

Added some documentation to make the project understandable

EllieDeSota

@@ Line 1: / Line 1: @@
+<nowiki>**</nowiki>Hey hey! The main idea of Abstract Poetry is below. It's a bit pitch-focused because we wrote it for an application, but hopefully it makes clear what we've done so far and some of the limitations, so anyone can judge how (or if) this can be useful for whatever we end up collaborating on. Cheers!{{Project
-{{Project
 |Homepage=abstract-poetry.fly.dev
 |Description=What if search didn’t stop at keywords? Abstract Poetry hopes to facilitate exhaustive search without relying on matching exact keywords to papers. By focusing on a paper-by-paper search process that learns the types of papers you're most interested in, we hope to eliminate the need for biasing algorithms and create a faster search process with a less overwhelming search interface. Right now, we want to make this useful! Could we integrate it with IPFS to begin a semantic decentralized search platform? What features might make this a tool that makes researchers' lives easier? How might the interface help aggregate and create more robust and checkable links to evidence? If you have thoughts, let us know!
 |Repository URL=https://github.com/curl-projects/abstract-poetry
 }}
+== What’s our main idea? ==
+Academic search is currently a giant spreadsheet which associates every academic paper with a set of keywords and metadata and waits for papers to be called upon by researchers.
+This data structure asks researchers to build a mental model of all the relevant research in their sub-field and associate that research with the key metadata required to access it. Holding lots of arbitrary information in our heads is something that we as humans aren’t very good at, so it takes years in one field for researchers to consistently do this well.
+To make this process easier, existing search platforms try to bring the most important results to the top of your search results. But computers aren’t very good at this. Most metrics of importance are actually metrics of popularity, which systematically biases search towards ‘hot topics’ and well-marketed research.
+Abstract Poetry flips this distribution of work. We ask humans to do what they’re good at: setting criteria for search and determining what’s important to them. And we ask computers to do what they’re good at: storing, processing and filtering millions of data points based on those criteria.
+We’ve done this in two ways:
+* '''We’ve rethought the interface for search.''' In our search, academics tell us how relevant each returned result is, which dynamically improves their future results. To represent their exploration, we return an interactive visualization of the connections between the researcher’s preferred results and their disliked ones. This gives them the opportunity to situate their preferred research within the context of all the results, relevant and irrelevant, within the domain.
+* '''We’ve given the computer a way to understand and categorize a continuous map of science.''' We’ve partnered with Semantic Scholar, which has given us access to a database of 768-dimensional semantic embeddings for 140 million papers. Collectively, these vectors create a high-dimensional map of science that represents scientific domains in terms of how similar papers are to each other. We explore this map using a Bayesian multi-armed bandit algorithm called Thompson sampling, which uses the researcher’s preferences (“More Like This/Less Like This”) to identify regions of the embedding space that the researcher is currently interested in.
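To make the Thompson sampling idea concrete, here is a minimal sketch in Python. It is not the project's actual model, which is undocumented here. It assumes the embedding space has already been split into a fixed number of regions (for example, by clustering), that each "More Like This" or "Less Like This" click counts as binary feedback for the region the paper came from, and that each region gets a Beta(1, 1) prior. The names <code>RegionBandit</code>, <code>choose_region</code>, <code>record_feedback</code> and <code>simulate</code> are hypothetical.

```python
import random


class RegionBandit:
    """Beta-Bernoulli Thompson sampling over hypothetical embedding regions.

    Assumption: regions are precomputed clusters of paper embeddings.
    This is an illustrative sketch, not Abstract Poetry's implementation.
    """

    def __init__(self, n_regions, seed=0):
        self.rng = random.Random(seed)
        # Beta(1, 1) prior per region: no opinion before any feedback.
        self.likes = [1] * n_regions
        self.dislikes = [1] * n_regions

    def choose_region(self):
        # Draw one plausible "like rate" per region from its posterior,
        # then recommend from the region with the highest draw.
        draws = [
            self.rng.betavariate(a, b)
            for a, b in zip(self.likes, self.dislikes)
        ]
        return max(range(len(draws)), key=draws.__getitem__)

    def record_feedback(self, region, liked):
        # "More Like This" raises the like count, "Less Like This"
        # raises the dislike count for the region that was shown.
        if liked:
            self.likes[region] += 1
        else:
            self.dislikes[region] += 1

    def posterior_mean(self):
        # Expected like rate per region under the current posterior.
        return [a / (a + b) for a, b in zip(self.likes, self.dislikes)]


def simulate(true_like_rates, n_rounds=500, seed=0):
    """Simulate a researcher whose like rate per region is known (made up)."""
    rng = random.Random(seed + 1)
    bandit = RegionBandit(len(true_like_rates), seed=seed)
    shown = [0] * len(true_like_rates)
    for _ in range(n_rounds):
        region = bandit.choose_region()
        liked = rng.random() < true_like_rates[region]
        bandit.record_feedback(region, liked)
        shown[region] += 1
    return bandit, shown


if __name__ == "__main__":
    bandit, shown = simulate([0.1, 0.2, 0.8])
    print("papers shown per region:", shown)
    print("estimated like rates:", bandit.posterior_mean())
```

Because each round samples from the posterior rather than always picking the current best estimate, regions with little feedback still get shown occasionally, while regions the researcher keeps liking are shown more and more often.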
+In the last two months, we have taken Abstract Poetry from an idea to two workable products (a general search and an interactive bibliography), completed 40 user discovery and product test interviews, and developed a set of internal systems which help us turn user problems into updates to our product.
+What we’ve done so far is promising, but limited.
+# '''PLOS is only 0.1% of science.'''  This limits our current search and prevents many researchers from being able to get quality results from our tool.
+# '''We are limited to Semantic Scholar’s semantic embeddings.'''  From user interviews, we’ve learned that knowing which papers contrast with each other, and what kind of similarity links two papers (methodology, claims, or topic), would be essential to improving the search experience. Semantic Scholar’s embeddings are not trained to provide this level of detail.
+# '''We have not documented our model.'''  Several of us work multiple jobs, meaning that the essential work of documenting the work we’ve done has been pushed to the side.