Pre-vowel Accents in Chabad’s CTR

The Chabad website has an edition of the Hebrew Bible called The Complete Tanach with Rashi (CTR). (See my document, “On the Provenance of Chabad’s CTR.”)

The CTR edition distinguishes three pairs of accents using a nonstandard mechanism rather than using the code points dedicated to making these distinctions. On a letter with a code point for a vowel mark (including HOLAM), CTR distinguishes three prepositives from their impositive “lookalikes” by the logical order of an accent code point relative to that vowel, as itemized in the numbered list below.

There are four levels of strangeness here.

In some but not all cases, the logical order of these code points reflects a desired horizontal visual order. Even when it does reflect a desired visual order, this visual order is very unlikely to be achieved, except in the case of YETIV before a vowel. In all other cases, few if any fonts will render the marks in the desired visual order, and in normalizing contexts like most web browsers, the font won’t even get a chance to try. In detail:

  1. The code point TIPEHA before a vowel means deḥi (!), e.g. to encode the segol and deḥi under the letter he in הֶ֭חרשתי (Psalm 32:3) (fully: הֶ֭חֱרַשְׁתִּי ),
  2. The code point TIPEHA after a vowel means tarḥa / tipeḥa, e.g. וְאֵ֖ין (Psalm 32:2).
  3. The code point GERESH before a vowel means geresh muqdam (!), e.g. to encode the tsere and geresh muqdam on the letter alef in אֵ֝ליו (Psalm 32:6) (fully: אֵ֝לָ֗יו ),
  4. The code point GERESH after a vowel means geresh, e.g. הַמַּ֜יִם (Genesis 1:9).
  5. The code point YETIV before a vowel means yetiv, e.g. to encode the ḥiriq and yetiv under the letter kaf in כִּ֚י (Joshua 2:11),
  6. The code point YETIV after a vowel means mahapakh (!), e.g. to encode the qamats and mahapakh under the letter tav in אתָּ֤ה (Psalm 32:7) (fully: אַתָּ֤ה׀ ).
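The six rules above can be sketched as a small decoder. This is only an illustration of the convention as described here, not CTR's own code: the names `CTR_MEANING`, `VOWELS`, and `ctr_accent_meanings` are hypothetical, and the set of code points treated as vowels (U+05B0 through U+05BB, plus U+05C7) is my assumption. The sketch handles one letter with its marks at a time and deliberately does not cover accents on a letter without a vowel.

```python
# Accent code points whose CTR reading depends on logical order.
# Each maps to (reading when BEFORE the vowel, reading when AFTER it).
CTR_MEANING = {
    "\u0596": ("deḥi", "tipeḥa"),           # HEBREW ACCENT TIPEHA
    "\u059C": ("geresh muqdam", "geresh"),  # HEBREW ACCENT GERESH
    "\u059A": ("yetiv", "mahapakh"),        # HEBREW ACCENT YETIV
}

# Assumption: "vowel" means the points U+05B0..U+05BB (which include
# HOLAM, U+05B9) plus QAMATS QATAN (U+05C7). Dagesh is not a vowel.
VOWELS = {chr(cp) for cp in range(0x05B0, 0x05BC)} | {"\u05C7"}


def ctr_accent_meanings(cluster):
    """Return the CTR readings of the order-sensitive accents in one
    letter-plus-marks cluster, in logical order.

    Returns an empty list when the cluster has no vowel, because the
    order-based rules say nothing about that case.
    """
    vowel_positions = [i for i, ch in enumerate(cluster) if ch in VOWELS]
    if not vowel_positions:
        return []
    first_vowel = vowel_positions[0]
    readings = []
    for i, ch in enumerate(cluster):
        if ch in CTR_MEANING:
            before, after = CTR_MEANING[ch]
            readings.append(before if i < first_vowel else after)
    return readings


# he + TIPEHA + segol: the accent precedes the vowel
print(ctr_accent_meanings("\u05D4\u0596\u05B6"))
# tav + dagesh + qamats + YETIV: the accent follows the vowel
print(ctr_accent_meanings("\u05EA\u05BC\u05B8\u059A"))
```

Comparing only against the first vowel on the letter is a simplification; I do not know whether CTR ever places two vowel code points on one letter.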

As noted above, CTR’s strange vowel-relative distinctions apply not only to the below-vowels but also to the one above-vowel, HOLAM.
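The fragility of this scheme in normalizing contexts can be checked directly, for a below-vowel and for HOLAM alike. In Unicode, every Hebrew vowel point has a lower canonical combining class than every Hebrew accent, so canonical reordering moves an accent that precedes a vowel to a position after it. The following minimal sketch uses Python's standard `unicodedata` module; the variable names and the two sample sequences are mine, built by hand rather than copied from CTR.

```python
import unicodedata

HE, ALEF = "\u05D4", "\u05D0"
TIPEHA, GERESH = "\u0596", "\u059C"   # accents
SEGOL, HOLAM = "\u05B6", "\u05B9"     # vowels (one below, one above)

# Accent-before-vowel sequences, i.e. the order CTR uses to signal
# deḥi and geresh muqdam respectively.
accent_first = {
    "TIPEHA before segol": HE + TIPEHA + SEGOL,
    "GERESH before HOLAM": ALEF + GERESH + HOLAM,
}


def code_points(s):
    """Format a string as a space-separated list of U+XXXX code points."""
    return " ".join(f"U+{ord(ch):04X}" for ch in s)


for label, original in accent_first.items():
    normalized = unicodedata.normalize("NFC", original)
    print(label)
    print("  original:  ", code_points(original))
    print("  normalized:", code_points(normalized))
    print("  order preserved:", normalized == original)

# The vowel sorts first because its combining class is lower.
assert unicodedata.combining(SEGOL) < unicodedata.combining(TIPEHA)
assert unicodedata.combining(HOLAM) < unicodedata.combining(GERESH)
```

After normalization, both sequences become vowel-then-accent, which under CTR's convention reads as tipeḥa and geresh rather than deḥi and geresh muqdam. The same reordering happens under NFD, since it is the canonical ordering step, not composition, that moves the marks.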

One might naturally wonder how CTR encodes the six accents of these three “lookalike” pairs when they are “bare,” i.e. when they are on a letter without a vowel mark.

In conclusion, CTR uses and abuses Unicode in strange ways that in most environments (font, browser, etc.) will not have the desired effect.