I would guess the training data (conversational, as opposed to coding-specific solutions) is weighted towards people finding errors in others' work more than people discussing errors in their own. If you knew there was an error in your thinking, you probably wouldn't think that way.
Part of my vision is to also help revolutionize adjacent healthcare fields. We're currently focused on premed, medical and dental for undergrads + recent medical graduates, but nursing, pharmacy, physiotherapy and veterinary school are all of great interest to me. I am a big animal lover personally.
For such markets, you can imagine that the TAM etc. is smaller, but still important. For us it's a blend of mission and business.
Thanks for the comment! I would love to chat vet-ed-tech further, I am on LinkedIn (/in/az1b) or email: azib [at] az1b [dot] com
I've built something along these lines. It uses OCR to extract text content, indexes it for RAG, uses a separate service to identify/match concepts to reference data in an RDF knowledge graph, and displays the original source documents with the references to KG concepts overlaid.
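To make the concept-matching step a bit more concrete, here is a minimal sketch, assuming nothing fancier than case-insensitive label lookup. The `Annotation` type, the `match_concepts` function and the example.org IRIs are all made up for illustration; the actual service presumably does something smarter than string matching.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    start: int  # character offset in the OCR text (inclusive)
    end: int    # character offset in the OCR text (exclusive)
    label: str  # the text that matched
    iri: str    # the knowledge-graph concept it refers to


def match_concepts(text: str, concepts: dict[str, str]) -> list[Annotation]:
    """Find non-overlapping mentions of concept labels in `text`."""
    annotations: list[Annotation] = []
    # Try longer labels first so "heart failure" wins over "heart".
    for label in sorted(concepts, key=len, reverse=True):
        pattern = re.compile(r"\b" + re.escape(label) + r"\b", re.IGNORECASE)
        for m in pattern.finditer(text):
            overlaps = any(m.start() < a.end and a.start < m.end() for a in annotations)
            if not overlaps:
                annotations.append(Annotation(m.start(), m.end(), m.group(0), concepts[label]))
    return sorted(annotations, key=lambda a: a.start)


# Hypothetical reference data: label -> concept IRI.
concepts = {
    "heart": "http://example.org/kg/Heart",
    "heart failure": "http://example.org/kg/HeartFailure",
}
spans = match_concepts("Heart failure weakens the heart.", concepts)
```

The character offsets are what a viewer would need to draw the overlay on top of the source document, given a mapping from OCR text positions back to page coordinates.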
Breaking a problem down into smaller problems and solving those that are immediately obvious or known from experience. For harder or new problems: gathering evidence if available, coming up with a hypothesis, testing it against the available evidence, looking for reasons why the hypothesis must be wrong and abandoning it if any are found, then iterating until an adequate hypothesis is found (adequate meaning provably correct, or "sounding sensible" based on solutions to similar problems). My 2c: being OK with uncertainty and with being wrong helps, and so does an awareness of cognitive biases.
I have an application that converts Word documents to RDF conformant with the SPAR ontologies (mainly DoCO http://www.sparontologies.net/ontologies/doco), so things like headers, numbering, and contains/within relationships are explicit in the RDF. I've used it successfully with PDFs by converting to DOCX first. Is this the sort of thing you had in mind? Not here to sell it! I think this is a genuinely interesting unexplored area ..
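For anyone wondering what "contains/within relationships explicit in the RDF" might look like, here is a small sketch that walks a document outline and prints N-Triples. It assumes DoCO class names (`Section`, `SectionTitle`, `Paragraph`) and the Pattern Ontology's `contains` property; the `Node` type, `to_ntriples` function and the example.org base IRI are hypothetical, and the output of the application described above may well differ.

```python
from dataclasses import dataclass, field

DOCO = "http://purl.org/spar/doco/"
PO = "http://www.essepuntato.it/2008/12/pattern#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
BASE = "http://example.org/doc/"  # hypothetical base IRI for document parts


@dataclass
class Node:
    kind: str  # DoCO class local name, e.g. "Section" or "Paragraph"
    children: list["Node"] = field(default_factory=list)


def to_ntriples(root: Node) -> list[str]:
    """Emit one rdf:type triple per node and one po:contains triple per parent/child edge."""
    lines: list[str] = []
    counter = 0

    def visit(node: Node) -> str:
        nonlocal counter
        counter += 1
        iri = f"{BASE}{node.kind.lower()}-{counter}"
        lines.append(f"<{iri}> <{RDF_TYPE}> <{DOCO}{node.kind}> .")
        for child in node.children:
            child_iri = visit(child)
            lines.append(f"<{iri}> <{PO}contains> <{child_iri}> .")
        return iri

    visit(root)
    return lines


# A section holding a title and one paragraph.
outline = Node("Section", [Node("SectionTitle"), Node("Paragraph")])
triples = to_ntriples(outline)
```

A real converter would more likely use an RDF library and carry the text content and numbering as literals; plain strings are used here only to keep the sketch dependency-free.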
The PDF format supports attachments (embedded files). I'm thinking about a set of libraries and/or a command-line utility that would make it trivially easy to attach a SQLite or JSON file to a PDF, or to extract one from a PDF. This won't fix existing files, of course, but at least apps that generate PDFs could more easily embed a SQLite/JSON file in their output.
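To show roughly what such an attachment looks like inside the file, here is a dependency-free sketch that writes a one-page blank PDF with an uncompressed embedded file and reads it back. This is not the proposed utility: `build_pdf_with_attachment` and `extract_attachment` are hypothetical names, the extractor only understands files written by this sketch, and a real tool would use a proper PDF library to handle existing documents, compression and non-ASCII file names.

```python
import json
import re


def build_pdf_with_attachment(filename: str, payload: bytes) -> bytes:
    """Return a blank one-page PDF that carries `payload` as an embedded file."""
    name = filename.encode("ascii")
    if any(c in name for c in b"()\\"):
        raise ValueError("sketch only handles names without parentheses or backslashes")
    objects = [
        # 1: catalog, pointing at the page tree and the EmbeddedFiles name tree
        b"<< /Type /Catalog /Pages 2 0 R "
        b"/Names << /EmbeddedFiles << /Names [(" + name + b") 4 0 R] >> >> >>",
        # 2: page tree with a single page
        b"<< /Type /Pages /Kids [3 0 R] /Count 1 >>",
        # 3: the blank page itself
        b"<< /Type /Page /Parent 2 0 R /MediaBox [0 0 612 792] >>",
        # 4: file specification naming the attachment
        b"<< /Type /Filespec /F (" + name + b") /UF (" + name + b") /EF << /F 5 0 R >> >>",
        # 5: the embedded file stream holding the raw bytes
        b"<< /Type /EmbeddedFile /Length " + str(len(payload)).encode("ascii")
        + b" >>\nstream\n" + payload + b"\nendstream",
    ]
    # Header plus a comment line of high bytes, which marks the file as binary.
    out = bytearray(b"%PDF-1.7\n%\xe2\xe3\xcf\xd3\n")
    offsets = []
    for number, body in enumerate(objects, start=1):
        offsets.append(len(out))  # byte offset of this object, needed for the xref table
        out += b"%d 0 obj\n" % number + body + b"\nendobj\n"
    xref_at = len(out)
    out += b"xref\n0 %d\n" % (len(objects) + 1)
    out += b"0000000000 65535 f \n"
    for offset in offsets:
        out += b"%010d 00000 n \n" % offset
    out += b"trailer\n<< /Size %d /Root 1 0 R >>\n" % (len(objects) + 1)
    out += b"startxref\n%d\n%%%%EOF\n" % xref_at
    return bytes(out)


def extract_attachment(pdf: bytes) -> bytes:
    """Read back the embedded file from a PDF produced by the function above."""
    m = re.search(rb"/Type /EmbeddedFile /Length (\d+) >>\nstream\n", pdf)
    if m is None:
        raise ValueError("no uncompressed embedded file found")
    return pdf[m.end():m.end() + int(m.group(1))]


payload = json.dumps({"rows": [1, 2, 3]}).encode("utf-8")
pdf = build_pdf_with_attachment("data.json", payload)
```

The same functions would carry a SQLite file just as well, since the stream is length-prefixed raw bytes; only the file name changes.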
This looks awesome! The decision to combine structural and rhetorical ontologies seems to strike the best balance between cost and availability, right in the sweet spot of users' actual requirements when working with research and academic documents.