Package org.apache.lucene.analysis.cjk
Class CJKWidthCharFilter
java.lang.Object
java.io.Reader
org.apache.lucene.analysis.CharFilter
org.apache.lucene.analysis.charfilter.BaseCharFilter
org.apache.lucene.analysis.cjk.CJKWidthCharFilter
- All Implemented Interfaces:
Closeable, AutoCloseable, Readable
A CharFilter that normalizes CJK width differences:
- Folds fullwidth ASCII variants into the equivalent Basic Latin characters
- Folds halfwidth Katakana variants into the equivalent kana
NOTE: this char filter is the exact counterpart of CJKWidthFilter.
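The two folds listed above can be illustrated with a small standalone sketch. It does not use Lucene and is not this class's implementation: WidthFoldSketch, foldChar, and fold are hypothetical names, and java.text.Normalizer stands in for the class's private lookup tables. The sketch assumes the Unicode layout in which fullwidth ASCII variants occupy U+FF01..U+FF5E (exactly 0xFEE0 above Basic Latin) and halfwidth Katakana occupies U+FF65..U+FF9F. It folds one char at a time, so it does not merge a kana with a following voice mark.

```java
import java.text.Normalizer;

// Hypothetical illustration only; not the Lucene implementation.
class WidthFoldSketch {

    /** Folds a single UTF-16 char; anything outside the two ranges passes through. */
    static char foldChar(char ch) {
        if (ch >= '\uFF01' && ch <= '\uFF5E') {
            // Fullwidth ASCII variants sit exactly 0xFEE0 above Basic Latin.
            return (char) (ch - 0xFEE0);
        }
        if (ch >= '\uFF65' && ch <= '\uFF9F') {
            // Halfwidth Katakana: NFKC of one such char yields its standard-width form.
            // (Assumption: used here in place of a hand-written lookup table.)
            String normalized = Normalizer.normalize(String.valueOf(ch), Normalizer.Form.NFKC);
            return normalized.charAt(0);
        }
        return ch;
    }

    /** Applies foldChar to every char of the input. */
    static String fold(String s) {
        StringBuilder out = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            out.append(foldChar(s.charAt(i)));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Fullwidth "Lucene" written with escapes to avoid source-encoding issues.
        System.out.println(fold("\uFF2C\uFF55\uFF43\uFF45\uFF4E\uFF45")); // prints "Lucene"
    }
}
```

Working per char keeps the output the same length as the input in this sketch, which is the simple case for offset correction. Combining a kana with a trailing voice mark shortens the text and is handled separately.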
Field Summary
Fields
private static final int HW_KATAKANA_SEMI_VOICED_MARK
private static final int HW_KATAKANA_VOICED_MARK
private int inputOff
private static final byte[] KANA_COMBINE_SEMI_VOICED
private static final byte[] KANA_COMBINE_VOICED
private static final char[] KANA_NORM
private int prevChar
Fields inherited from class org.apache.lucene.analysis.CharFilter
input
Constructor Summary
Constructors
CJKWidthCharFilter: default constructor that takes a Reader.
Method Summary
private int combineVoiceMark(int ch, int voiceMark)
Returns the combined char if the voice mark was combined successfully, otherwise the original char.
int read()
int read(char[] cbuf, int off, int len)
Methods inherited from class org.apache.lucene.analysis.charfilter.BaseCharFilter
addOffCorrectMap, correct, getLastCumulativeDiff
Methods inherited from class org.apache.lucene.analysis.CharFilter
close, correctOffset
Methods inherited from class java.io.Reader
mark, markSupported, nullReader, read, read, ready, reset, skip, transferTo
Field Details
KANA_NORM
private static final char[] KANA_NORM
KANA_COMBINE_VOICED
private static final byte[] KANA_COMBINE_VOICED
KANA_COMBINE_SEMI_VOICED
private static final byte[] KANA_COMBINE_SEMI_VOICED
HW_KATAKANA_VOICED_MARK
private static final int HW_KATAKANA_VOICED_MARK
HW_KATAKANA_SEMI_VOICED_MARK
private static final int HW_KATAKANA_SEMI_VOICED_MARK
prevChar
private int prevChar
inputOff
private int inputOff
Constructor Details
CJKWidthCharFilter
Default constructor that takes a Reader.
Method Details
read
public int read() throws IOException
- Overrides: read in class Reader
- Throws: IOException
combineVoiceMark
private int combineVoiceMark(int ch, int voiceMark)
Returns the combined char if the voice mark was combined successfully, otherwise the original char.
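The contract of combineVoiceMark can be sketched without Lucene as follows. This is an illustration under stated assumptions, not the class's code: VoiceMarkSketch and its constants are hypothetical, the mark values U+FF9E and U+FF9F are taken from the Unicode halfwidth voiced and semi-voiced sound marks, and NFC composition via java.text.Normalizer is swapped in for the private KANA_COMBINE_VOICED and KANA_COMBINE_SEMI_VOICED tables. The sketch also assumes ch has already been folded to a standard-width kana.

```java
import java.text.Normalizer;

// Hypothetical illustration only; not the Lucene implementation.
class VoiceMarkSketch {
    static final int HW_VOICED = 0xFF9E;      // halfwidth katakana voiced sound mark
    static final int HW_SEMI_VOICED = 0xFF9F; // halfwidth katakana semi-voiced sound mark

    /** Returns the combined char if ch and voiceMark compose, otherwise ch unchanged. */
    static int combineVoiceMark(int ch, int voiceMark) {
        if (!Character.isValidCodePoint(ch)) {
            return ch; // e.g. -1 at end of stream
        }
        int combining;
        if (voiceMark == HW_VOICED) {
            combining = 0x3099; // combining voiced sound mark
        } else if (voiceMark == HW_SEMI_VOICED) {
            combining = 0x309A; // combining semi-voiced sound mark
        } else {
            return ch; // not a voice mark
        }
        String pair = new StringBuilder()
                .appendCodePoint(ch)
                .appendCodePoint(combining)
                .toString();
        // NFC merges the pair into one code point only when a precomposed kana exists.
        String composed = Normalizer.normalize(pair, Normalizer.Form.NFC);
        boolean merged = composed.codePointCount(0, composed.length()) == 1;
        return merged ? composed.codePointAt(0) : ch;
    }

    public static void main(String[] args) {
        // KA (U+30AB) followed by a halfwidth voiced mark becomes GA (U+30AC).
        System.out.println(Integer.toHexString(combineVoiceMark(0x30AB, HW_VOICED))); // prints "30ac"
    }
}
```

A caller can detect success by comparing the result with ch. If the two differ, the mark was consumed and the output is one char shorter than the input at that position.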
read
public int read(char[] cbuf, int off, int len) throws IOException
- Specified by: read in class Reader
- Throws: IOException