Sentence-Fragment.zip (20.3 KB)
I have attached a zip file containing a rule that searches for "sentence fragments", plus the full test results (see below for a summary).
Finding 52 errors in 311 articles may seem excessive. However, although the rule only looks for sentence fragments, it touches upon one of the least understood aspects of sentence construction in English: subordinate conjunctions and their relationship to an independent clause. I am working on a fuller set of rules that will deal specifically with conjunctions of all types.
Unfortunately, pinpointing the exact location of an error is nearly impossible; the rule can only say that an error exists. The error message reflects this:
R0.1B: This can be a subtle error: “\2 \3” introduce a subordinate clause; however, there is neither punctuation nor coordinating conjunction to indicate a main clause. Alternatively, if the SC is subordinate to the preceding sentence, the two sentences should be joined. A subordinate conjunction ending a sentence would normally not be punctuated: Though a colon may be used for emphasis. Another possibility, with narrative or rhetorical styles, is a missing question or exclamation mark. Finally, correct punctuation is very sensitive to phrasing.
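As a rough illustration of the condition this message describes, here is a minimal Python sketch. It is not the attached rule: the word lists, the function name `looks_like_fragment`, and the test for "punctuation" (any comma, semicolon or colon) are all assumptions made for the example, and the check is far cruder than a real rule would need to be.

```python
import re

# Hypothetical, deliberately short word lists; the attached rule's own lists are not shown here.
SUBORDINATORS = {"although", "because", "if", "since", "though", "unless", "when", "whereas", "while"}
COORDINATORS = {"and", "but", "or", "nor", "for", "so", "yet"}


def looks_like_fragment(sentence: str) -> bool:
    """Flag a sentence that opens with a subordinate conjunction but shows
    neither internal punctuation nor a coordinating conjunction."""
    words = re.findall(r"[A-Za-z']+", sentence.lower())
    if not words or words[0] not in SUBORDINATORS:
        return False
    # Questions and exclamations are left alone in this sketch (an assumption).
    if sentence.rstrip().endswith(("?", "!")):
        return False
    has_internal_punctuation = re.search(r"[,;:]", sentence) is not None
    has_coordinator = any(word in COORDINATORS for word in words[1:])
    return not (has_internal_punctuation or has_coordinator)


print(looks_like_fragment("Although the results were promising."))
print(looks_like_fragment("Although the results were promising, the sample was small."))
```

Like the rule itself, a check of this kind can only say that something is wrong with the sentence, not where; it also treats every "for" or "so" as a coordinating conjunction, so it will miss some fragments.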
I hope you find this useful.
Summary of results:
"Rule 0.1 Sentence fragments". The test sample was 311 articles from Wikipedia: 11.5 MB excluding footnotes and references.
There are 97 recorded errors. Of these, 22 were caused by converting tables into pure text, 12 were quotes (2 of which were deliberate puns), 5 were subheadings, 1 was caused by an abbreviation with an incorrect full stop (I marked the appropriate link), 1 was caused by the text conversion being unable to handle subscripts, and 1 was an intentional example. This leaves a total of 55 errors. Of these, 3 involved mathematical texts, the problem being that it is fairly common, when writing mathematical formulae, to spread a sentence over several lines. I suggested bodge solutions, but the truth is that my punctuation rules do not handle this type of notation very well.
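The breakdown above can be checked with a short tally. The category labels are paraphrased from the paragraph and the variable names are only illustrative; the counts are the ones given.

```python
# Counts taken from the summary above; labels are paraphrased for the example.
recorded = 97
excluded = {
    "tables converted to pure text": 22,
    "quotes": 12,
    "subheadings": 5,
    "abbreviation with incorrect full stop": 1,
    "subscripts lost in text conversion": 1,
    "intentional example": 1,
}

remaining = recorded - sum(excluded.values())  # errors left after the exclusions
mathematical = 3                               # sentences spread over several lines
valid = remaining - mathematical               # hits treated as valid punctuation errors

print(remaining, valid)  # prints "55 52"
```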
The other 52 hits are all valid punctuation errors, and I have inserted the corrected sentence below each error report. Often, the error is tied up with poor phrasing and sentence construction (using 'That' instead of 'This' is very common). Where this happens, I have offered a basic correction, followed by a fuller correction of the entire passage.