Ignoring some tags for LaTeX documents

Hello,

I’m trying to spell check a LaTeX document. I think I can ignore certain predefined LaTeX words using
((SpellingCheckRule)activeRule).addIgnoreTokens(ignoreWords), but I’m not sure how to:

  1. Skip spell checking of equations, i.e. ignore all words between $ and $, or between $$ and $$
  2. Skip spell checking of lines that start with the comment character “%”. For example, lines starting with “%%%%” are still checked by LanguageTool.

So, the question is: can I also ignore words that contain certain special characters or match some pattern?

best, Sergei

How to ignore parts of a text is documented - but a bit hidden - in our change log:

A new method JLanguageTool.check(AnnotatedText) has been introduced that allows you to check text with markup. Use AnnotatedTextBuilder to build up the input.

So you don’t need to define words, but you need to tell LT what parts are plain text and which are not. Does that help?

Hello, Thanks,

I’ve looked at the API of this class, but I still have no idea how to skip spell checking of everything between $ signs and of any keyword that starts with “\”, i.e. how to apply this class to a sentence like:

This is an equation $\sqrt{s}$ and this is a citation \cite{}.

(I do not want to spell check $..$ and “\cite”)

best, Sergei

AnnotatedTextBuilder cannot really help you with splitting the text, you will need to do that yourself. For example, by using regular expressions or Java’s StringTokenizer. Once you have the parts you can feed them into AnnotatedTextBuilder. It would be nice if you could then publish your class as Open Source so others don’t have to re-invent the wheel.
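
As a rough illustration of the splitting step described above, here is a minimal sketch that uses only the Java standard library. The class name LatexSegmenter, the Segment type and the regular expression are hypothetical and not part of LanguageTool. The sketch assumes that "markup" means $...$ and $$...$$ math, % comments up to the end of the line, and \commands together with their [...] and {...} arguments. It does not handle escaped characters such as \$ or \%, or nested braces.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: splits LaTeX source into plain-text and markup parts. */
final class LatexSegmenter {

    /** One part of the input; markup parts are the ones the checker should skip. */
    static final class Segment {
        final String content;
        final boolean markup;

        Segment(String content, boolean markup) {
            this.content = content;
            this.markup = markup;
        }
    }

    // Assumed definition of markup (see the note above the code).
    private static final Pattern MARKUP = Pattern.compile(
        "\\$\\$[^$]*\\$\\$"                                  // display math $$...$$
        + "|\\$[^$]*\\$"                                     // inline math $...$
        + "|%[^\\n]*"                                        // comment up to end of line
        + "|\\\\[a-zA-Z]+(?:\\[[^\\]]*\\]|\\{[^}]*\\})*");   // \command[opt]{arg}

    /** Returns the parts in order; concatenating them gives back the input. */
    static List<Segment> split(String text) {
        List<Segment> parts = new ArrayList<>();
        Matcher m = MARKUP.matcher(text);
        int last = 0;
        while (m.find()) {
            if (m.start() > last) {
                // plain text between the previous match and this one
                parts.add(new Segment(text.substring(last, m.start()), false));
            }
            parts.add(new Segment(m.group(), true));
            last = m.end();
        }
        if (last < text.length()) {
            parts.add(new Segment(text.substring(last), false));
        }
        return parts;
    }

    public static void main(String[] args) {
        String input = "This is an equation $\\sqrt{s}$ and this is a citation \\cite{}.";
        for (Segment s : split(input)) {
            // With LanguageTool on the classpath, each part would be passed to
            // AnnotatedTextBuilder.addMarkup(...) or addText(...) instead of printed.
            System.out.println((s.markup ? "markup: " : "text:   ") + s.content);
        }
    }
}
```

Because the parts are kept in order and nothing is dropped, the positions reported for the annotated text should still refer to the original document.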

Ok, I’ve found a simple solution: replace everything between $ and $ with spaces (so the text length stays the same) and disable the WHITE_SPACE rule.

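import java.util.regex.Matcher;
import java.util.regex.Pattern;

/**
 * Returns a string of spaces with the given length.
 * @param leng number of spaces
 * @return string consisting of leng spaces
 */
public static String padString(int leng) {
    StringBuilder str = new StringBuilder(leng);
    for (int i = 0; i < leng; i++) str.append(' ');
    return str.toString();
}

/**
 * Replace LaTeX math with spaces, preserving the text length
 * @param text LaTeX source
 * @return text with $...$ and $$...$$ blanked out
 */
public static String modifyLatexTags(String text) {

    // match $$...$$ first, then $...$
    Pattern p = Pattern.compile("\\$\\$[^$]*\\$\\$|\\$[^$]*\\$");
    Matcher m = p.matcher(text);
    StringBuilder out = new StringBuilder(text);
    while (m.find()) {
        // overwrite the match in place; the offsets stay valid
        // because the replacement has the same length
        out.replace(m.start(), m.end(), padString(m.end() - m.start()));
    }

    return out.toString();
}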

best, Sergei
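
The same padding idea could probably be extended to the other cases raised in this thread, i.e. % comment lines and commands such as \cite{}. The following is only a sketch under assumptions: the class name LatexBlanker and the regular expression are hypothetical, commands are blanked together with at most one {...} argument, and escaped characters such as \$ or \% are not handled. Unlike the version above, it keeps line breaks inside a match, so line numbers reported by the checker should stay the same.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: overwrites LaTeX math, comments and commands with spaces. */
final class LatexBlanker {

    // Assumed set of things to hide from the spell checker.
    private static final Pattern SKIP = Pattern.compile(
        "\\$\\$[^$]*\\$\\$"                     // display math $$...$$
        + "|\\$[^$]*\\$"                        // inline math $...$
        + "|%[^\\n]*"                           // comment up to end of line
        + "|\\\\[a-zA-Z]+(?:\\{[^}]*\\})?");    // \command with an optional {argument}

    /** Returns a copy of text with the same length and the same line breaks. */
    static String blank(String text) {
        StringBuilder out = new StringBuilder(text);
        Matcher m = SKIP.matcher(text);
        while (m.find()) {
            for (int i = m.start(); i < m.end(); i++) {
                char c = out.charAt(i);
                // keep line breaks so line and column positions are unchanged
                if (c != '\n' && c != '\r') {
                    out.setCharAt(i, ' ');
                }
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String input = "See $x^2$ here % note\nnext \\cite{a}.";
        // The blanked text would then be passed to the checker as plain text.
        System.out.println(blank(input));
    }
}
```

Note that blanking the {...} argument also hides real words inside commands like \emph{...}, so the pattern would need adjusting if those should still be checked.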