The problem with the Donaudampfschiffahrtsgesellschaftskapitän
German in particular has very long compound words: nouns made up of several other words (also called noun compounds). This post was mostly automatically translated from German.
Some more abstract examples:
And some more common:
Discussions of this topic often use funny examples that are rarely needed in practice. But in the daily editorial routine of maintaining a website, word length causes problems all the time.
Web page layouts often use banners or other design elements in which the design limits the horizontal width of the text, so many German words inevitably have to be broken across lines.
Browsers currently offer little help with hyphenation
In theory the browser can hyphenate words itself if this has been enabled in the layout's stylesheets. However, no browser does this particularly well for all languages, and the implementation in the same browser also varies from operating system to operating system. This means that as an editor, you cannot rely on the text you enter wrapping for website visitors the way you intended, or the way the rules of the German language require. This is especially true for responsive layouts, where you cannot know at which width the visitor views the website.
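As a rough sketch of what "defined in the stylesheets" means: browsers only attempt automatic hyphenation when the CSS property `hyphens` is set to `auto` and the element's language is known. The helper below sets both from script. The interface `HyphenTarget` and the function name `enableAutoHyphenation` are made up for this illustration; in a real project you would normally put `hyphens: auto` into the stylesheet and `lang="de"` into the markup instead.

```typescript
// Structural subset of HTMLElement (hypothetical name), so the sketch
// can also be tried outside a browser.
interface HyphenTarget {
  lang: string;
  style: { setProperty(name: string, value: string): void };
}

// Ask the browser to hyphenate automatically. The language tag tells it
// which hyphenation rules to use; without it, nothing is hyphenated.
function enableAutoHyphenation(target: HyphenTarget, language: string): void {
  target.lang = language;
  target.style.setProperty("hyphens", "auto");
}
```

In a browser, any `HTMLElement` can be passed as the target, for example `enableAutoHyphenation(document.body, "de")`. How good the resulting breaks are still depends on the browser and operating system.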
In English, wrapping works relatively well, although such long compounds rarely occur there anyway.
In German, the result is that the browser often breaks a word not at a syllable boundary but at an arbitrary position, and it does not necessarily display a hyphen to signal the break to the reader. The text becomes much harder to read, and website visitors start to worry about the German language skills of the editorial staff.
The reliable but manual solution to this problem is to insert invisible separators, which serve as a signal to the browser at which point a wrap is intended.
In German this special character is called Weiches Trennzeichen. More common in the web environment is the English term Soft Hyphen. This is also where the HTML entity "&shy;" comes from, which represents the separator character in HTML. In Unicode it is the code point U+00AD, written \u00AD in many programming languages. See Wikipedia.
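To make the character a little more tangible, here is a minimal sketch that builds a word with soft hyphens and converts it to the HTML entity form. The function names `withSoftHyphens` and `toHtmlEntity` are invented for this example, and the break points are chosen by hand, not computed.

```typescript
// The soft hyphen, Unicode code point U+00AD. It is invisible unless the
// browser actually breaks the word at this position.
const SOFT_HYPHEN = "\u00AD";

// Join hand-picked word parts with soft hyphens.
function withSoftHyphens(parts: string[]): string {
  return parts.join(SOFT_HYPHEN);
}

// Replace every soft hyphen with the named HTML entity, for example to
// make the break points visible in the page source.
function toHtmlEntity(text: string): string {
  return text.split(SOFT_HYPHEN).join("&shy;");
}

const word = withSoftHyphens([
  "Donau", "dampf", "schiffahrts", "gesellschafts", "kapitän",
]);
console.log(toHtmlEntity(word));
```

The browser may now break the word at any of the four marked positions and shows a hyphen only where it actually breaks.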
Since these characters are awkward to insert by hand in the text editors of common content management systems, and are also invisible, a different solution is required. There are already individual plugins for Neos CMS, TYPO3 and other systems that solve this, for example by having the editorial staff type a different character sequence, which is then replaced by a soft hyphen when the page is rendered. One variant I have seen several times uses two vertical pipes "||".
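The placeholder approach boils down to a string replacement at render time. The following is only a sketch of the idea, not the code of any particular plugin; the function name `replacePlaceholders` is hypothetical.

```typescript
// Hypothetical render-time filter: every occurrence of the placeholder
// typed by the editors is replaced with a real soft hyphen (U+00AD).
function replacePlaceholders(text: string, placeholder: string = "||"): string {
  return text.split(placeholder).join("\u00AD");
}

const rendered = replacePlaceholders("Schiff||fahrts||gesellschaft");
console.log(rendered.includes("||")); // false
```

The filter cannot tell an intended break point from a literal "||", so a text such as "a || b" is converted as well.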
However, this leads to several problems:
- The text has interfering characters that have no place there.
- The editors must check text in the frontend, or in a preview mode.
- The output of the web page is minimally slower, because all text must be searched for these characters.
- The selected character combination is never output as itself, even if this is desired.
- If the texts containing these characters are output in other channels or via interfaces, it must be ensured that the separators are also inserted.
- The spelling correction of the operating system or browser no longer works.
Especially the last point can cause problems with the inner workings of content management systems and with additional plugins (SEO, sitemaps, tables of contents, ...). So the better solution would be to use the real separator, but make it visible during editing.
Another variant for automatic hyphenation
The solution for Neos CMS
Entering real separators without these disadvantages is now possible with my new extension for Neos CMS, Shel.Neos.Hyphens.
It extends the CKEditor 5 text editor used in Neos with a plugin that allows soft hyphens to be inserted at any position in the text. Since Neos lets you edit text directly in the web page, you can see immediately how the text behaves. You don't even have to publish the page and view it in the frontend.
The soft separator is represented here by a slightly colored hyphen. This is only visible in the backend. Internally, the real separator is stored in the database in its Unicode representation. This solves all the problems mentioned above and even works with the spellchecker of the operating system or browser.
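The principle of "real character in storage, visible marker only while editing" can be sketched as follows. This is an illustration under assumptions and not the actual implementation of Shel.Neos.Hyphens, which works as a CKEditor 5 plugin; the class name `soft-hyphen-marker` and all function names are made up.

```typescript
const SHY_CHAR = "\u00AD";

// Escape the characters that are significant in HTML text content.
function escapeHtml(text: string): string {
  return text.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;");
}

// Editing view: keep the real soft hyphen, but wrap it in an element
// that a backend stylesheet could display as a colored hyphen.
function renderForEditing(stored: string): string {
  const marker = `<span class="soft-hyphen-marker">${SHY_CHAR}</span>`;
  return escapeHtml(stored).split(SHY_CHAR).join(marker);
}

// Public view: the stored text is output as it is, apart from escaping.
function renderForFrontend(stored: string): string {
  return escapeHtml(stored);
}
```

Because the stored text already contains U+00AD, the frontend and any other output channel need no extra replacement step.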
Just try my extension on your Neos CMS website. Of course, it works in all languages and not only in German. I welcome feedback and ideas for improvement.