Pre-vowel Accents in Chabad’s CTR
The Chabad website has an edition of the Hebrew Bible called The Complete Tanach with Rashi (CTR).
(See my document, “On the Provenance of Chabad’s CTR.”)
CTR distinguishes three pairs of accents using a nonstandard mechanism rather than the code points
dedicated to making these distinctions. On a letter that carries a vowel mark (including HOLAM),
CTR distinguishes the following prepositives from their impositive “lookalikes” by whether the
accent code point comes before or after the vowel code point in logical order:
- The code point TIPEHA means either
deḥi (!) or
tarḥa / tipeḥa,
depending on context.
- Before a vowel, it means deḥi (!).
- After a vowel (the normal order), it means
tarḥa / tipeḥa (the
normal meaning of TIPEHA).
- The code point GERESH means either
geresh muqdam (!) or geresh, depending
on context.
- Before a vowel, it means geresh muqdam (!).
- After a vowel (the normal order), it means geresh (the normal
meaning of GERESH).
- The code point YETIV means either
yetiv or mahapakh (!), depending on
context.
- Before a vowel, it means yetiv (the normal meaning of
YETIV).
- After a vowel (the normal order), it means mahapakh (!).
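The rule above can be sketched as a small lookup. This is only an illustration of the rule as described, not code taken from CTR or from any tool; the function name `ctr_accent_meanings` and the exact set of code points treated as vowel marks are assumptions.

```python
# Assumption: the code points treated as "vowel marks" are U+05B0..U+05BB
# (sheva through qubuts, which includes HOLAM) plus QAMATS QATAN (U+05C7).
VOWELS = {chr(cp) for cp in range(0x05B0, 0x05BC)} | {"\u05C7"}

TIPEHA = "\N{HEBREW ACCENT TIPEHA}"
GERESH = "\N{HEBREW ACCENT GERESH}"
YETIV = "\N{HEBREW ACCENT YETIV}"

# accent code point -> (meaning before a vowel, meaning after a vowel)
CTR_MEANING = {
    TIPEHA: ("dehi", "tarha/tipeha"),
    GERESH: ("geresh muqdam", "geresh"),
    YETIV: ("yetiv", "mahapakh"),
}


def ctr_accent_meanings(cluster: str) -> list[str]:
    """Interpret the accents in one cluster (a letter plus its marks).

    Returns one entry per TIPEHA/GERESH/YETIV found. If the cluster has no
    vowel, the entry is "bare", because order cannot disambiguate it.
    """
    vowel_at = next((i for i, ch in enumerate(cluster) if ch in VOWELS), None)
    meanings = []
    for i, ch in enumerate(cluster):
        if ch not in CTR_MEANING:
            continue
        before, after = CTR_MEANING[ch]
        if vowel_at is None:
            meanings.append("bare")
        elif i < vowel_at:
            meanings.append(before)
        else:
            meanings.append(after)
    return meanings


HE = "\N{HEBREW LETTER HE}"
SEGOL = "\N{HEBREW POINT SEGOL}"
print(ctr_accent_meanings(HE + TIPEHA + SEGOL))  # accent before vowel
print(ctr_accent_meanings(HE + SEGOL + TIPEHA))  # accent after vowel
```

The sketch only works on text whose logical order has not yet been disturbed, e.g. by normalization.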
There are four levels of strangeness in this scheme.
- As mentioned above, there are code points dedicated to making these distinctions. These code
points have been around since the introduction of Hebrew accent code points, in Unicode 2.0.
- The accent-before-vowel order will be normalized away in some environments, notably in most web
browsers, because Unicode canonical ordering sorts vowel points ahead of accents.
- The use of YETIV breaks the pattern established by
TIPEHA and GERESH, where the
impositive code point does double duty, and the prepositive code point is not used. If the pattern
were followed, MAHAPAKH rather than
YETIV would be used: MAHAPAKH before
a vowel would mean yetiv (surprising but at least following the
pattern) and MAHAPAKH after a vowel (the normal order) would mean
mahapakh (the normal meaning of
MAHAPAKH).
- No font I am aware of will “understand” this encoding.
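The second point, that normalization erases the accent-before-vowel order, can be checked with Python's standard `unicodedata` module. This is a minimal sketch; the variable names are illustrative, and the he with segol is just one convenient test case.

```python
import unicodedata

HE = "\N{HEBREW LETTER HE}"
SEGOL = "\N{HEBREW POINT SEGOL}"
TIPEHA = "\N{HEBREW ACCENT TIPEHA}"

ctr_dehi = HE + TIPEHA + SEGOL   # accent before vowel: CTR's way of saying "dehi"
ctr_tarha = HE + SEGOL + TIPEHA  # vowel before accent: ordinary tarha / tipeha


def code_point_names(text: str) -> list[str]:
    """Return the Unicode character names of text, in logical order."""
    return [unicodedata.name(ch) for ch in text]


# Canonical ordering sorts a run of combining marks by combining class,
# and the vowel point has a lower class than the accent.
print(unicodedata.combining(SEGOL), unicodedata.combining(TIPEHA))

normalized = unicodedata.normalize("NFC", ctr_dehi)
print(code_point_names(ctr_dehi))
print(code_point_names(normalized))

# After normalization the two CTR encodings are the same string,
# so the distinction CTR tried to make is gone.
print(normalized == ctr_tarha)
```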
In some but not all cases, the logical order of these code points reflects a desired horizontal
visual order. Even when it does reflect a desired visual order, this visual order is very unlikely
to be achieved, except in the case of YETIV before a vowel. In all
other cases, few if any fonts will render the marks in the desired visual order, and in normalizing
contexts like most web browsers, the font won’t even get a chance to try. In detail:
- The code point TIPEHA before a vowel means
deḥi (!), e.g. to encode the segol
and deḥi under the letter he in
הֶ֭חרשתי (Psalm 32:3) (fully:
הֶ֭חֱרַשְׁתִּי ),
- Instead of ‹POINT SEGOL, DEHI›,
CTR uses ‹TIPEHA,
POINT SEGOL›.
- I.e., instead of הֶ+ה֭, CTR uses
ה֖+הֶ. (The plus sign expression indicates a kind of concatenation, and is
meant to be read right to left.)
- This is very unlikely to have the desired appearance in most fonts. Plus, in normalizing
contexts like most web browsers, the font won’t even get a chance to try. In this document’s
context, it will look like this: הֶ֖חרשתי.
- Above I have ignored a געיה that CTR also has under the
he. I have ignored it because (1) it is not relevant to the issue at
hand and (2) it is one of many “extra” געיה marks that CTR has
compared to many other editions.
- The code point TIPEHA after a vowel means
tarḥa / tipeḥa, e.g.
וְאֵ֖ין (Psalm 32:2).
- The code point GERESH before a vowel means
geresh muqdam (!), e.g. to encode the
tsere and geresh muqdam on the letter
alef in אֵ֝ליו (Psalm 32:6) (fully:
אֵ֝לָ֗יו ),
- Instead of ‹TSERE, GERESH
MUQDAM›, CTR uses ‹GERESH,
TSERE›.
- I.e., instead of אֵ+א֝, CTR uses
א֜+אֵ. (The plus sign expression indicates a kind of concatenation, and is
meant to be read right to left.)
- This is unlikely to have quite the desired appearance in most fonts, and in normalizing contexts
like most web browsers, the font won’t even get a chance to try. In this document’s context, it
will look like this: אֵ֜ליו. Because the two marks in question are not both below-marks, this looks
pretty close to the desired appearance, but it is still not quite what is desired.
- The code point GERESH after a vowel means
geresh, e.g. הַמַּ֜יִם (Genesis 1:9).
- The code point YETIV before a vowel means
yetiv, e.g. to encode the ḥiriq and
yetiv under the letter kaf in
כִּ֚י (Joshua 2:11),
- Instead of ‹HIRIQ, YETIV›,
CTR uses ‹YETIV,
HIRIQ›.
- I.e., instead of כִּ+כּ֚, CTR uses
כּ֚+כִּ. (The plus sign expression indicates a kind of concatenation, and is
meant to be read right to left.)
- Although this is a strange order to encode it in, this is very likely to have the desired
appearance in most fonts.
- The code point YETIV after a vowel means
mahapakh (!), e.g. to encode the
qamats and mahapakh under the letter
tav in אתָּ֤ה (Psalm 32:7) (fully:
אַתָּ֤ה׀ ),
- Instead of ‹QAMATS, MAHAPAKH›,
CTR uses ‹QAMATS,
YETIV›.
- I.e., instead of תָּ+תּ֤, CTR uses
תָּ+תּ֚. (The plus sign expression indicates a kind of concatenation, and is
meant to be read right to left.)
- This is very unlikely to have the desired appearance in most fonts. Plus, in normalizing
contexts like most web browsers, the font won’t even get a chance to try. In this document’s
context, it will look like this: אתָּ֚ה.
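Taken together, the cases above suggest how CTR-style text could be converted to the dedicated code points before anything normalizes it. The following is a hedged sketch, not a tested converter for real CTR data: `recode_ctr` and its helpers are hypothetical names, the vowel set is an assumption, and bare accents are deliberately left untouched because order cannot disambiguate them.

```python
import unicodedata

# Assumption: U+05B0..U+05BB plus QAMATS QATAN (U+05C7) count as vowel marks.
VOWELS = {chr(cp) for cp in range(0x05B0, 0x05BC)} | {"\u05C7"}

TIPEHA = "\N{HEBREW ACCENT TIPEHA}"
DEHI = "\N{HEBREW ACCENT DEHI}"
GERESH = "\N{HEBREW ACCENT GERESH}"
GERESH_MUQDAM = "\N{HEBREW ACCENT GERESH MUQDAM}"
YETIV = "\N{HEBREW ACCENT YETIV}"
MAHAPAKH = "\N{HEBREW ACCENT MAHAPAKH}"

BEFORE_VOWEL = {TIPEHA: DEHI, GERESH: GERESH_MUQDAM}  # YETIV keeps its meaning
AFTER_VOWEL = {YETIV: MAHAPAKH}  # TIPEHA and GERESH keep their meaning


def split_clusters(text: str) -> list[str]:
    """Split text into runs of one base character plus its combining marks."""
    clusters: list[str] = []
    for ch in text:
        if unicodedata.combining(ch) and clusters:
            clusters[-1] += ch
        else:
            clusters.append(ch)
    return clusters


def recode_cluster(cluster: str) -> str:
    """Swap in the dedicated accent code points within one cluster."""
    vowel_at = next((i for i, ch in enumerate(cluster) if ch in VOWELS), None)
    if vowel_at is None:
        return cluster  # bare accent: order carries no information
    out = []
    for i, ch in enumerate(cluster):
        table = BEFORE_VOWEL if i < vowel_at else AFTER_VOWEL
        out.append(table.get(ch, ch))
    return "".join(out)


def recode_ctr(text: str) -> str:
    """Hypothetical helper: recode CTR-style text, then normalize to NFC."""
    recoded = "".join(recode_cluster(c) for c in split_clusters(text))
    return unicodedata.normalize("NFC", recoded)
```

Once the distinction is carried by the code point itself, normalizing is harmless, which is why the sketch can safely finish with NFC.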
As noted above, CTR’s strange vowel-relative distinctions apply
not only to the below-vowels but also to the one above-vowel,
HOLAM.
- The TIPEHA code point before
HOLAM means deḥi, e.g. (rendering
CTR’s contents in this document’s context)
אֹ֖זֶן, אֹ֖מֶר, and כָּל־רֹ֖אַי
(Psalm 18:45, 19:4, and 22:8). But, consistent with the general sloppiness of
CTR, sometimes TIPEHA appears after
HOLAM, even when a deḥi is (or
should be) intended, e.g. (rendering CTR’s contents in this
document’s context) in בֹּ֖קֶר and עֹ֖ז (Psalm 5:4 and
22:11). Note that TIPEHA before
HOLAM is somewhat analogous to
GERESH before a below-vowel. In both cases, the logical order does
not reflect a desired horizontal visual order, since in each case, one of the marks is a below-mark
and the other is an above-mark. Rather, the logical order reflects at most a desired horizontal
visual alignment (right-biased rather than centered) of the accent relative to a vowel-free area of
its letter. (That area is the letter’s top for GERESH and its bottom
for TIPEHA.) Because this is completely nonstandard, the desired
visual alignment is very unlikely to be achieved in most or all fonts.
- The GERESH code point before
HOLAM means geresh muqdam, e.g.
(rendering CTR’s contents in this document’s context)
מִכָּל־רֹ֜דְפַ֗י, פֹּ֜רֵ֗ק, and
כָּל־אֹ֜יְבָ֗יו (Psalm 7:2, 7:3, and 18:1). Note that
GERESH before HOLAM, like
GERESH before a below-vowel, does not reflect a desired distinction
in horizontal visual order, since both geresh and
geresh muqdam should, visually, appear before
ḥolam. Rather, the logical order reflects at most a desired
distinction in horizontal visual alignment (right-biased rather than centered) of the accent
relative to its letter. Or, if you like, you can think of GERESH
logically before HOLAM as meaning a
geresh visually far before
ḥolam, as opposed to GERESH
logically after HOLAM, which means a
geresh still visually before ḥolam,
but not so far before it.
- The YETIV code point before
HOLAM means yetiv, e.g.
כֹּ֚ל (Joshua 1:4) and YETIV after
HOLAM means mahapakh, e.g. (rendering
CTR’s contents in this document’s context)
(Psalm 6:11). Note that
YETIV before HOLAM is yet another
case where the logical order does not reflect a desired horizontal visual order, since
YETIV is a below-mark and HOLAM an
above-mark.
One might naturally wonder how CTR encodes the six accents of these three “lookalike” pairs when
they are “bare,” i.e. on a letter that has no vowel mark to share with the accent.
- The TIPEHA code point is used for both bare
deḥi and bare tarḥa. This results in
an ambiguity. E.g. CTR codes the bare
deḥi in כִּי־ה֭וּא (Psalm 24:2) as
TIPEHA. In this document’s context, that
TIPEHA will look like this: ה֖וּא ,
i.e. it will look like a tarḥa. This makes it indistinguishable, for
example, from the bare tarḥa in ע֖וּרָה (Psalm 59:5).
- The GERESH code point is used for both bare
geresh muqdam and bare geresh. These
accents are exclusive to the poetic and prose systems respectively, so even when they are
bare, there is no ambiguity (assuming we know which accent system the word belongs to). Note that
although Job is, for the most part, a poetically accented book, its
introduction and conclusion are prose-accented. So a bare GERESH in
Job could be either a geresh muqdam or a
geresh, depending on whether or not its verse is in the range 3:2 to
42:6 (inclusive).
- The YETIV code point is used for both bare
yetiv and bare mahapakh.
Yetiv is exclusive to the prose system, so there is no ambiguity if we
know that the word belongs to the poetic system: the accent must be mahapakh. If the word belongs
to the prose system, then bare YETIV is ambiguous.
- With no pattern I can discern, sometimes MAHAPAKH is used for a
bare mahapakh, as in שִׂמְח֤וּ (Psalm
32:11).
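The reasoning about bare accents can be summarized as a lookup keyed on the accent system of the verse. This is a sketch under stated assumptions: the function names, the book-name strings, and the chapter/verse interface are hypothetical. The code also assumes two facts not stated above: that Psalms and Proverbs are poetically accented throughout, and that deḥi occurs only in the poetic system.

```python
# Assumption (not stated above): these books are poetically accented throughout.
FULLY_POETIC_BOOKS = {"Psalms", "Proverbs"}
# Job is poetically accented only in this inclusive (chapter, verse) range.
JOB_POETIC_RANGE = ((3, 2), (42, 6))


def is_poetic(book: str, chapter: int, verse: int) -> bool:
    """Report whether a verse uses the poetic accent system."""
    if book in FULLY_POETIC_BOOKS:
        return True
    if book == "Job":
        start, end = JOB_POETIC_RANGE
        return start <= (chapter, verse) <= end
    return False


def bare_accent_candidates(code_point: str, book: str, chapter: int, verse: int) -> list[str]:
    """List what a bare CTR accent code point could mean in a given verse.

    More than one candidate means the encoding is ambiguous there.
    """
    poetic = is_poetic(book, chapter, verse)
    if code_point == "GERESH":
        return ["geresh muqdam"] if poetic else ["geresh"]
    if code_point == "YETIV":
        return ["mahapakh"] if poetic else ["yetiv", "mahapakh"]
    if code_point == "TIPEHA":
        # Assumption: dehi does not occur in the prose system.
        return ["dehi", "tarha"] if poetic else ["tipeha"]
    raise ValueError(f"not one of the three overloaded code points: {code_point}")


print(bare_accent_candidates("GERESH", "Job", 1, 5))
print(bare_accent_candidates("GERESH", "Job", 3, 2))
print(bare_accent_candidates("TIPEHA", "Psalms", 24, 2))
```

The sketch ignores CTR's occasional use of MAHAPAKH for a bare mahapakh, since that code point is already unambiguous.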
In conclusion, CTR uses and abuses Unicode in strange ways that,
in most environments (font, browser, etc.), will not have the desired effect.