I believe I’ve finally come up with the best solution: an orphan page called genesis_1_data. These pages aren’t really orphans, since they will all link to each other (with next and back links, and perhaps links to the TOC and chapter, etc.).
The point is that users normally won’t visit these pages, but rather pages generated from them. The trick is that the DokuWiki page will be created from the data page(!!) and not the other way around.
This appears to solve all the problems I was experiencing before about parsing RefNotes, etc.
First, we get a data page that serves as the source document, designed with a custom markup especially suited for our reader. Lines would look something like this:
bookname Genesis
booknum 1
1.1.1.vtext in the beginning[1a]... (this would be a verse-text line)
1.1.1.hebrew (Hebrew here)
1.1.1.note.1a (named notes auto linking to square bracketed notes, see vtext above)
1.1.1.cr Isaiah 1:1
1.1.1.rashi full text of a Rashi block here
1.1.2.vtext etc[2a]
1.1.2.note.2a etc
As you can see, this is more of a data file designed to be parsed, but it avoids certain problems, like having to retain headings and apply them to the next verse. Instead, it simply lists, in order, the heading before the verse. It is easier to produce a DokuWiki document from this than to go in the reverse direction and build a state machine that remembers headings, verse numbers, notes, etc.
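To try the idea on for size, here is a minimal sketch of a reader for this first format. It is written in Python purely for illustration (the plan above is a PHP script), and the names `LINE_RE` and `parse_flat` are made up for this example. It assumes every data line has the shape book.chapter.verse.field, with an optional name after the field for named notes.

```python
import re
from collections import defaultdict

# Matches lines such as "1.1.2.note.2a some text":
# book.chapter.verse.field, an optional ".name" (used by named notes), then the text.
LINE_RE = re.compile(r"^(\d+)\.(\d+)\.(\d+)\.([a-z]+)(?:\.(\w+))?\s+(.*)$")

def parse_flat(text):
    """Group field lines by (book, chapter, verse); no heading state is carried."""
    meta = {}
    verses = defaultdict(lambda: defaultdict(list))
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        m = LINE_RE.match(line)
        if m:
            book, chapter, verse, field, name, body = m.groups()
            key = (int(book), int(chapter), int(verse))
            verses[key][field].append((name, body))
        else:
            # Anything else is read as "keyword value", e.g. "bookname Genesis".
            keyword, _, value = line.partition(" ")
            meta[keyword] = value
    return meta, verses

sample = """\
bookname Genesis
booknum 1
1.1.1.vtext In the beginning[1a]...
1.1.1.note.1a a note on verse one
1.1.1.cr Isaiah 1:1
1.1.2.vtext verse two[2a]
1.1.2.note.2a a note on verse two
"""
meta, verses = parse_flat(sample)
print(meta["bookname"], sorted(verses))
```

Because every line carries its full address, the lines can appear in any order and the reader never has to remember what came before.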
Is this the perfect solution? I’m impressed with myself, actually. This is going to be dead simple to parse; the PHP script will be able to refresh DokuWiki pages from this automatically, and they will look nice in DokuWiki. Named notes with RefNotes? Wow, auto-done. During main editing, the commentary, Rashi, etc. can be out of order, and the same goes for cross-references, which is better for me. As for formatting, which was a looming issue since some Bibles have particular formatting as part of their schtick, the full formatting will be taken until end-of-paragraph.
More power, Less verbose
Something a bit more powerful, and less verbose, would be to improve the above by using state. Since we saw a “bookname Genesis” we are using state to not have to type “Genesis” on each line. What about some kind of format as follows, using “dot commands” to establish state?
.bookname Genesis
.chapter 1
.booknum 1
.chaptername Genesis 1
.chapterheading The first book of Moses, called Genesis.
.heading in the beginning
1 In the beginning[1a]...
1a this is a note
2 verse two
3 verse three
3a comment out of order
2a another comment
Something such as the above could also be read line by line, but each line would be processed according to the state the program was in. Each line would be read and then dispatched to a function; i.e. if we get a .bookname command we would go to the bookname function to process its text into the state machine (and produce, say, a == Genesis 1 == heading). In this case, a tag in the verse text could be [a] or [1a]; it wouldn’t matter, as the number would be superfluous on the tag, but not on the note’s definition below. Here we still need to remember the heading and attach it to verse 1, but that is really a trivial part of the state machine.
The first format idea or the second? We really do not need to put the book number on every line, nor the chapter number; so the second format in that sense is really very close to the first. Then, lines beginning with numbers are verses, and lines which do not begin with numbers will be assumed to begin with a keyword (or will generate an error). Comment lines can begin with // or # as expected.
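The line rules for the second format can be sketched roughly as follows. This is Python for illustration only, not the eventual PHP, and `classify` and `parse_chapter` are hypothetical names. It assumes note definitions look like a number followed by one lowercase letter (1a, 2a), and that a .heading attaches to the next verse read.

```python
import re

COMMAND_RE = re.compile(r"^\.(\w+)(?:\s+(.*))?$")  # ".bookname Genesis"
NOTE_RE = re.compile(r"^(\d+[a-z])\s+(.*)$")       # "1a this is a note"
VERSE_RE = re.compile(r"^(\d+)\s+(.*)$")           # "1 In the beginning[1a]..."

def classify(line):
    """Return (kind, key, value) for one line, or raise on an unknown line."""
    line = line.strip()
    if not line or line.startswith("//") or line.startswith("#"):
        return ("skip", None, None)
    m = COMMAND_RE.match(line)
    if m:
        return ("command", m.group(1), m.group(2) or "")
    m = NOTE_RE.match(line)
    if m:
        return ("note", m.group(1), m.group(2))
    m = VERSE_RE.match(line)
    if m:
        return ("verse", int(m.group(1)), m.group(2))
    raise ValueError(f"unrecognised line: {line!r}")

def parse_chapter(text):
    state = {}              # values set by dot commands (.bookname, .chapter, ...)
    verses = {}             # verse number -> {"text": ..., "heading": ...}
    notes = {}              # note name such as "1a" -> note text
    pending_heading = None  # remembered until the next verse arrives
    for raw in text.splitlines():
        kind, key, value = classify(raw)
        if kind == "command":
            if key == "heading":
                pending_heading = value
            else:
                state[key] = value
        elif kind == "verse":
            verses[key] = {"text": value, "heading": pending_heading}
            pending_heading = None
        elif kind == "note":
            notes[key] = value
    return state, verses, notes

sample = """\
# comment lines are skipped
.bookname Genesis
.chapter 1
.heading in the beginning
1 In the beginning[1a]...
1a this is a note
2 verse two
3 verse three
3a comment out of order
2a another comment
"""
state, verses, notes = parse_chapter(sample)
print(state, verses[1]["heading"], list(notes))
```

Notes are stored by name, so it makes no difference that 3a is defined before 2a.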
This looks really promising.
How would Rashi, then, for example, be defined?
rashi
1 Rashi text block for verse 1
1 additional Rashi text block for verse 1
2 text block for verse 2
The Rashi keyword would send the program state into the Rashi reader, which would be a function that would read until it hit a command it didn’t recognize, then exit.
I suppose such functions would return a position within the text file for the main reader to resume from. So when a handler encountered a command word it didn’t know (i.e. an unexpected command word, or fin, etc.) it would return the position of that command word (or of the command word after fin) for the main loop to read and find another handler.
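A rough sketch of that hand-off, again in Python rather than PHP and with hypothetical names (`read_rashi`, `HANDLERS`, `main_loop`): the handler consumes the lines it understands and returns the index where it stopped, and the main loop picks the next handler from there. It assumes the file has already been split into a list of lines and that fin is simply skipped by the main loop.

```python
def read_rashi(lines, pos, out):
    """Consume "N text" lines starting at pos.

    Returns the index of the first line this handler does not recognise,
    so the main loop can resume there.
    """
    while pos < len(lines):
        number, _, text = lines[pos].strip().partition(" ")
        if not number.isdigit():
            break  # unknown word: hand control back without consuming it
        out.setdefault(int(number), []).append(text)
        pos += 1
    return pos

# Keyword -> handler function. Only one handler is sketched here.
HANDLERS = {"rashi": read_rashi}

def main_loop(lines):
    result = {}
    pos = 0
    while pos < len(lines):
        keyword = lines[pos].strip()
        if keyword in ("", "fin"):
            pos += 1  # blank lines and the end marker carry no data
            continue
        handler = HANDLERS.get(keyword)
        if handler is None:
            raise ValueError(f"line {pos + 1}: no handler for {keyword!r}")
        section = result.setdefault(keyword, {})
        # The handler starts on the line after its keyword and reports
        # where it stopped.
        pos = handler(lines, pos + 1, section)
    return result

lines = [
    "rashi",
    "1 Rashi text block for verse 1",
    "1 additional Rashi text block for verse 1",
    "2 text block for verse 2",
    "fin",
]
print(main_loop(lines))
```

Returning an index rather than the remaining lines keeps error messages simple, since the main loop always knows which line it is on.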
Needless to say, producing and saving a DokuWiki markup version of one of these files would be a trivial task.
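As a rough illustration of how small that step could be, here is a Python sketch (the real script would be PHP; `to_dokuwiki` and `MARKER_RE` are made-up names). It assumes the verses and notes have already been parsed into dictionaries. To keep it simple it emits DokuWiki's built-in footnote syntax, ((...)), instead of RefNotes markup, so the named, linked notes described above are not reproduced here.

```python
import re

MARKER_RE = re.compile(r"\[(\w+)\]")  # inline note markers such as [1a]

def to_dokuwiki(title, verses, notes):
    """Render one chapter as DokuWiki markup.

    verses: {number: {"text": str, "heading": str or None}}
    notes:  {name: text}, e.g. {"1a": "this is a note"}
    """
    out = [f"====== {title} ======", ""]
    for number in sorted(verses):
        verse = verses[number]
        if verse.get("heading"):
            out += [f"===== {verse['heading']} =====", ""]

        def footnote(match):
            name = match.group(1)
            # Markers without a definition are left exactly as typed.
            return f"(({notes[name]}))" if name in notes else match.group(0)

        out.append(f"**{number}** " + MARKER_RE.sub(footnote, verse["text"]))
        out.append("")
    return "\n".join(out)

page = to_dokuwiki(
    "Genesis 1",
    {
        1: {"text": "In the beginning[1a]...", "heading": "In the Beginning"},
        2: {"text": "verse two[2b]", "heading": None},
    },
    {"1a": "this is a note"},
)
print(page)
```

The output is plain text, so saving it is just a matter of writing it to the page file that DokuWiki reads.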
I’ll also quickly look at cross refs.
.crossref
1 Isaiah 1:1
1 Isaiah 2:3 Isaiah 7:14
2 Isaiah 1:1, Isaiah 2:2, Isaiah 3:3
.
Just trying it on for size (the dots) leads me to think that a dot in front of a keyword would signal an end-of-section. So if a keyword is read which begins with a dot, it means “new handler function”.
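A small Python sketch of a cross-reference handler along these lines (illustrative only; `REF_RE` and `parse_crossrefs` are hypothetical names). It assumes a reference is a book name or abbreviation, an optional dot, then chapter:verse, and it stops at the first line starting with a dot. Books with a leading number, such as 1 Kings, are not handled by this pattern.

```python
import re

# A book name or abbreviation, an optional dot, then chapter:verse.
# Accepts "Isaiah 1:1", "Isa.3:3" and "Psa. 6:1".
REF_RE = re.compile(r"([A-Za-z]+)\.?\s*(\d+):(\d+)")

def parse_crossrefs(lines):
    """Read the lines that follow a .crossref keyword.

    Returns {verse number: [(book, chapter, verse), ...]}.
    """
    refs = {}
    for raw in lines:
        line = raw.strip()
        if line.startswith("."):
            break  # a bare "." or the next dot keyword ends the section
        verse, _, rest = line.partition(" ")
        if not verse.isdigit():
            continue  # not a "N references" line; ignored in this sketch
        for book, chapter, ref_verse in REF_RE.findall(rest):
            refs.setdefault(int(verse), []).append(
                (book, int(chapter), int(ref_verse))
            )
    return refs

section = [
    "1 Isaiah 1:1",
    "1 Isaiah 2:3 Isaiah 7:14",
    "2 Isaiah 2:2, Isa.3:3, Psa. 6:1",
    ".",
    "3 Job 1:1",
]
crossrefs = parse_crossrefs(section)
print(crossrefs)
```

Since the pattern finds references anywhere on the line, it does not matter whether they are separated by commas or just spaces.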
OK, so here’s my feeling for the format. The page name would be genesis_1_data:
.chapter Genesis 1
title The First book of Moses, called Genesis
.vtext
heading In the Beginning
1 In the beginning[1a]...
2 verse
3 verse
.notes
1a note here
2a note here
4a note here
4b note here (these will look like a, b, c, d in the render, and will be linked)
.crossref
1 Isaiah 1:1
2 Isaiah 2:2, Isa.3:3, Job 3:3, Job.4:4, Job.5:5
2 Psa. 6:1
.rashi
1 rashi textblock
1 another
Of course each one will probably be on its own line, to keep things clean for DokuWiki, not that it will matter much.
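To check that this layout really is easy to read, here is a minimal Python sketch (not the planned PHP; `split_sections` is a made-up name) that cuts a data page into sections at each dot keyword. It assumes one item per line and leaves the contents of each section to that section's own handler.

```python
def split_sections(text):
    """Return a list of (keyword, argument, body_lines) in file order."""
    sections = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("//", "#")):
            continue  # blank lines and comments carry no data
        if line.startswith("."):
            # A dot keyword closes the previous section and opens a new one.
            keyword, _, argument = line[1:].partition(" ")
            sections.append((keyword, argument, []))
        elif sections:
            sections[-1][2].append(line)
        else:
            raise ValueError(f"text before the first dot keyword: {line!r}")
    return sections

sample = """\
.chapter Genesis 1
title The First book of Moses, called Genesis
.vtext
heading In the Beginning
1 In the beginning[1a]...
2 verse
.notes
1a note here
.crossref
1 Isaiah 1:1
.rashi
1 rashi textblock
1 another
"""
sections = split_sections(sample)
for keyword, argument, body in sections:
    print(keyword, argument, len(body))
```

Each (keyword, argument, body) triple can then be passed to whichever handler function matches the keyword.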
This looks good now.