TrueBase is a new kind of database designed for modeling complex systems. TrueBase is like Wikipedia but fully computable: the focus is on comparable data, not narratives. A TrueBase is a collection of files (concepts that adhere to column schemas) that compile to a single table ("the model"), which can be used to answer questions in a data-backed way. TrueBase is a way to build an "expert system" out of simple parts. A TrueBase model is fully human-readable and auditable; there are no black-box parameters.
TrueBase will benefit greatly from LLMs. First, they will provide a powerful and easy new way to query a TrueBase using natural language. Second, they will help experts building TrueBases add data, identify and fix mistakes, and even help improve the schemas of a TrueBase.
TrueBases may also help solve the "hallucination problem" in LLMs by providing trustworthy data that can be queried and referenced when answering prompts.
Modeling complex systems is hard. In addition to dealing with complexity, one has to worry about propaganda, advertisements, paywalls, trackers, licenses, proprietary formats and so forth. TrueBase is designed to be a simple way to slowly build complex models while making no compromises on truth.
The cutting-edge option is to work from the demo PlanetsDB TrueBase:
git clone https://github.com/breck7/truebase
cd truebase
npm install .
npm run local
Most people will want to use the Getting Started Guide. We are currently beta testing the Getting Started Guide. If you'd like to join the waitlist, please email firstname.lastname@example.org.
TrueBase helps people build the smallest expert models that can truthfully answer the most valuable questions.
TrueBase is very minimal. It's built on Tree Notation, a syntax-free notation that can support any data structure. The focus is on structured data (with a small number of unstructured fields to help data editors). Because of this, TrueBase allows for expert models that weigh in at megabytes, not terabytes.
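To make the "syntax-free" idea concrete, here is a minimal sketch of reading Tree Notation-style text in Python. It assumes a simplified reading of the notation: the first word of a line is a key, the rest of the line is its value, and one leading space per level marks nesting. The function name `parse_tree` and the sample concept are hypothetical, and this is not the parser TrueBase itself uses.

```python
def parse_tree(text: str) -> list[dict]:
    """Parse indentation-nested lines into a list of nodes.

    Each node is {"key": first word, "value": rest of line, "children": [...]}.
    Assumption: one leading space per nesting level.
    """
    root = {"key": None, "value": "", "children": []}
    stack = [(-1, root)]  # (indent depth, node); the root sits above depth 0
    for line in text.split("\n"):
        if not line.strip():
            continue  # ignore blank lines in this sketch
        depth = len(line) - len(line.lstrip(" "))
        key, _, value = line.strip().partition(" ")
        node = {"key": key, "value": value, "children": []}
        # Pop back up until the top of the stack is this node's parent.
        while stack[-1][0] >= depth:
            stack.pop()
        stack[-1][1]["children"].append(node)
        stack.append((depth, node))
    return root["children"]


# Hypothetical concept file, for illustration only.
sample = """title Mars
moons 2
atmosphere
 mainGas co2"""

nodes = parse_tree(sample)
for node in nodes:
    print(node["key"], repr(node["value"]), len(node["children"]))
```

There are no quotes, brackets or commas to escape, which is part of why such files stay small and easy to edit by hand.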
Traditional encyclopedias are weakly linked articles about things. Question sites are weakly typed narrative answers to user questions. TrueBase is different.
TrueBase is focused on computable structured data. We call this the "triangle of truth": Concepts, Questions and Models.
First, we start adding concepts to the database. Each concept is a single file about a thing. But it's not enough just to list and describe the things.
Structured questions are what make a TrueBase different from a Wiki. A typical TrueBase will have hundreds, to thousands, to tens of thousands of questions.
Our critical bet about questions is that to really reveal the truth, it is just as critical to identify what we don't know and/or what has been purposefully left out. One byte of structured data can be worth more than a billion words of narrative. This is especially important in expert datasets with no easily repeatable results. For example, a database of academic studies might be harmful without a question indicating who paid for a study.
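One way to surface "what we don't know" is a coverage report: for each question (column), how many concepts actually answer it. The sketch below assumes the model is already available as a list of dictionaries, one per concept. The helper names, the `id` field, and the study rows are all hypothetical and only illustrate the funding example above.

```python
def is_answered(row: dict, question: str) -> bool:
    """A question counts as answered when the row has a non-empty value."""
    return row.get(question) not in (None, "")


def coverage(rows: list[dict], questions: list[str]) -> dict[str, float]:
    """Fraction of concepts that answer each question."""
    total = len(rows)
    if total == 0:
        return {q: 0.0 for q in questions}
    return {q: sum(is_answered(r, q) for r in rows) / total for q in questions}


def unanswered(rows: list[dict], question: str) -> list[str]:
    """Ids of the concepts that leave a question blank."""
    return [r["id"] for r in rows if not is_answered(r, question)]


# Hypothetical rows, for illustration only.
studies = [
    {"id": "study-a", "sampleSize": "120", "fundedBy": "university grant"},
    {"id": "study-b", "sampleSize": "45"},
    {"id": "study-c", "sampleSize": "", "fundedBy": ""},
]

report = coverage(studies, ["sampleSize", "fundedBy"])
missing_funding = unanswered(studies, "fundedBy")
print(report)
print(missing_funding)
```

Listing the gaps next to the answers makes a missing funding source visible as a fact about the dataset rather than something a reader has to notice on their own.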
So in a TrueBase, it's just as easy to add a question as it is to add a concept. Sometimes adding a new question can completely flip the model or change the certainty of a model, more so than adding more answers to existing questions.
Models are the fun part. Once you have a lot of data, you can use it to build your own models of how things work.
Interconnectedness of the data is very important. When you put the effort in to integrate disparate data into one model, the resulting model is worth more than the sum of its parts. Not only does it become more useful because it's easier to compute over, but it also becomes increasingly difficult to lie as the size of a TrueBase increases due to the increasing number of logical constraints that the data would have to violate. A big TrueBase is hard to vary.
In TrueBase, you store your data in plain text files. This means your data is readily accessible—you can even view and edit it by hand.
You put your data in Tree Notation form. This means your data is all signal, no noise. This ensures you've minimized your data and made it as clean as possible. It also helps make your data timeless: no matter what format you need it in later, it will be preserved in the simplest form possible. You will never regret putting your data in a TrueBase.
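As a rough illustration of how a folder of plain text concept files could become one table, the sketch below compiles a few in-memory "files" into rows with a shared set of columns and renders them as CSV. It handles only flat `key value` lines. The file names, the `.txt` extension, the column names and the helper functions are assumptions made for the example, not TrueBase's actual build step.

```python
import csv
import io


def parse_concept(concept_id: str, text: str) -> dict[str, str]:
    """Turn flat 'key value' lines into one table row."""
    row = {"id": concept_id}
    for line in text.splitlines():
        if not line.strip() or line.startswith(" "):
            continue  # this flat sketch skips blank and nested lines
        key, _, value = line.partition(" ")
        row[key] = value
    return row


def compile_model(files: dict[str, str]) -> tuple[list[str], list[dict[str, str]]]:
    """Compile {filename: text} into (columns, rows), sorted by filename."""
    rows = [
        parse_concept(name.rsplit(".", 1)[0], text)
        for name, text in sorted(files.items())
    ]
    columns = ["id"]
    for row in rows:
        for key in row:
            if key not in columns:
                columns.append(key)  # union of columns, in first-seen order
    return columns, rows


def to_csv(columns: list[str], rows: list[dict[str, str]]) -> str:
    """Render the model as CSV, leaving unanswered cells empty."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=columns, restval="", lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical concept files, for illustration only.
files = {
    "mars.txt": "title Mars\nmoons 2\n",
    "earth.txt": "title Earth\nmoons 1\nhasLife true\n",
}

columns, rows = compile_model(files)
print(to_csv(columns, rows))
```

Because the source files hold nothing but keys and values, converting them to CSV, JSON or a SQL table later is a small transformation rather than a migration.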
A growing ecosystem of tooling makes it easy to augment your TrueBase with data from Large Language Models, web crawlers, and APIs, and run integrity checks, steadily making your TrueBase truer and truer.
You write your TrueBase schemas using the Grammar Language (a Tree Language) which enforces correctness, autofixes errors, and gives you tooling like autocomplete and syntax highlighting.
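The sketch below shows the general idea of schema-enforced correctness, written in plain Python rather than in the Grammar Language. It assumes a schema is just a mapping from column name to an expected type. The `SCHEMA` contents, the `validate` function and the error wording are hypothetical and do not reflect the Grammar Language's real syntax or messages.

```python
import re

# Hypothetical schema: column name -> expected Python type.
SCHEMA = {
    "title": str,
    "moons": int,
    "hasLife": bool,
}


def check_value(expected: type, raw: str) -> bool:
    """Check a raw text cell against the expected type."""
    if expected is int:
        return re.fullmatch(r"-?\d+", raw) is not None
    if expected is bool:
        return raw in ("true", "false")
    return raw != ""  # plain strings only need to be non-empty


def validate(row: dict[str, str], schema: dict[str, type]) -> list[str]:
    """Return a list of human-readable problems; an empty list means valid."""
    errors = []
    for column, raw in row.items():
        if column not in schema:
            errors.append(f"unknown column: {column}")
        elif not check_value(schema[column], raw):
            errors.append(f"{column}: expected {schema[column].__name__}, got {raw!r}")
    return errors


good = validate({"title": "Mars", "moons": "2"}, SCHEMA)
bad = validate({"title": "Mars", "moons": "two", "colour": "red"}, SCHEMA)
print(good)
print(bad)
```

A real schema language can go further than this, for example by suggesting fixes or driving autocomplete, but the core loop of checking every cell against a declared column type is the same.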
You can query your TrueBase using TQL (also a Tree Language).
You can display your data using Scroll (also a Tree Language).
So there are many pieces to the TrueBase system, but really just one thing to learn: Tree Notation. Your data, your query language, your schemas, your display language are all in these simple plain text Tree Languages.
Yes. TrueBase is public domain and is designed for public domain databases.
For large databases, TrueBase requires fast computers and fast SSDs. TrueBase was not possible before the Apple M1s, which shipped in November 2020. Here is a post about early unsuccessful attempts at using TrueBase before Apple M1s.