Searching for the lost X

Started by vampirefrog, December 05, 2017, 01:49:08 AM


vampirefrog

Hello. I've noticed some MDX files use special characters, such as the X68000 "X" character. Example: MXDRV/Anime/NONTAN.MDX. The SJIS code is eb 9f, which seems to be outside the range of normal SJIS. I am trying to locate the character data, but it doesn't seem to be in cgrom.dat. Or perhaps I am not looking in the right spot? Does anyone know where it is? Thanks!

I've also extracted the characters in cgrom and mapped them to SJIS. You can view them on this page, next to a normal serif font: http://vgmrips.net/sjis/ (CAUTION: over 7500 PNG images)


vampirefrog

Thanks, it seems that if I rename USKCG.SYS to something else, the "X" symbol does not appear anymore in MMDSP.

But this is all I found in USKCG.SYS (see attachments), and the picture you showed me contains an "X" symbol made from two characters, which I did not find.

Do you know if there is documentation for USKCG.SYS and X68000 Gaiji? Thanks!

takka

Sorry.
"http://kha-in.info/retro-computer/assets/20150626x68-wordprocessor/img/x68-wordprocessor-char-list.png" shows symbols original to WP.x.

'uskcgm.x' is a tool to edit and list USKCG.SYS.

C>uskcgm.x
更新ですか, 登録ですか? [U/L] <- input 'U' (prompt: "Update or register?")
入力ファイル名 [USKCG.SYS]: <- press Enter (prompt: "Input file name")
出力ファイル名 [USKCG.SYS]: <- press Enter (prompt: "Output file name")

The screen changes. Press 'F3' to view all symbols.

Documentation for USKCG.SYS
http://www.geocities.co.jp/SiliconValley3115/Human68k/uskcgm.html



vampirefrog

Thank you, it seems they are the same characters, with the exception that in USKCG.SYS, there are some hidden ones that you can see in my black on white images above.

Now to map these characters to Shift_JIS. I believe the code above each is the JIS code. I know that the special "X" is represented in Shift_JIS as 0xeb 0x9f, and in uskcgm it is marked as 7621. So let's see if the formula on the Wikipedia page applies. The JIS bytes 0x76, 0x21 are 118, 33 in decimal. 95 <= 118 <= 126, therefore s1 = floor((118 + 1) / 2) + 176 = 59 + 176 = 235. 118 is even, therefore s2 = 33 + 126 = 159. So s1,s2 = 235,159 = eb, 9f. It works out perfectly! Hooray!
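For anyone who wants to check other codes shown in uskcgm, the arithmetic above can be sketched in a few lines of Python. This is only an illustration of the JIS to Shift_JIS formula as described in this post; the function name jis_to_sjis is made up here, and the odd-row branch (s2 = j2 + 31 + floor(j2 / 96)) is the standard rule for odd rows, which the example above does not exercise.

```python
def jis_to_sjis(j1, j2):
    """Convert a JIS byte pair (each 0x21..0x7e) to a Shift_JIS byte pair.

    Hypothetical helper name; follows the formula worked through in the post.
    """
    if not (0x21 <= j1 <= 0x7E and 0x21 <= j2 <= 0x7E):
        raise ValueError("JIS bytes must be in 0x21..0x7e")
    # Lead byte: floor division, with an offset that depends on the row range.
    s1 = (j1 + 1) // 2 + (112 if j1 <= 94 else 176)
    if j1 % 2:
        # Odd row: trail bytes skip 0x7f, hence the extra j2 // 96.
        s2 = j2 + 31 + j2 // 96
    else:
        # Even row: trail bytes start at 0x9f.
        s2 = j2 + 126
    return s1, s2


if __name__ == "__main__":
    s1, s2 = jis_to_sjis(0x76, 0x21)  # the special "X", marked 7621 in uskcgm
    print(f"{s1:02x} {s2:02x}")       # prints "eb 9f"
```

The same function can be fed any other four-digit code from the uskcgm screen, split into its two bytes.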

So if I just create a file with these Shift_JIS values, from 0xeb 0x80 to 0xeb 0xff, I get 82 characters, which you can see in the last attachment, and they include the hidden characters that USKCGM doesn't show.
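A test file like the one described could be built roughly as follows. This is a sketch, not the program that produced the attachment: the function name sjis_range, the 16-characters-per-line layout and the CR LF line endings are all assumptions made for the example.

```python
def sjis_range(lead, first=0x80, last=0xFF, per_line=16):
    """Return bytes holding every lead+trail pair from first to last.

    Hypothetical helper: a CR LF is appended after every per_line pairs
    (an assumed layout, chosen so the file is readable in a text viewer).
    """
    out = bytearray()
    for i, trail in enumerate(range(first, last + 1)):
        out += bytes((lead, trail))
        if (i + 1) % per_line == 0:
            out += b"\r\n"
    return bytes(out)


if __name__ == "__main__":
    data = sjis_range(0xEB)
    # Show the first few pairs as hex; write `data` to a file opened in
    # binary mode ("wb") to view it on the X68000 side.
    print(data[:8].hex(" "))
```

Not every trail byte in that range will display a glyph, so the number of visible characters depends on what USKCG.SYS actually defines.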

Besides this range, there are also some other nonstandard Shift_JIS values, for 8x8 characters, half-width katakana and some other stuff.

vampirefrog

I have written a C program that outputs all the SJIS ranges. I've attached the C file and the generated output which you can scroll through with ed.

I have replaced some bytes with their hex value, because they were problematic: 0x00, 0x09 (tab), 0x0a, 0x0d, 0x1a (apparently this is considered EOF).
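The replacement step can be sketched like this in Python. It is not the attached C program, just an illustration of the idea; the name escape_problem_bytes and the choice of two lowercase hex digits as the replacement text are assumptions.

```python
# Bytes listed in the post as problematic when written literally.
PROBLEM_BYTES = {0x00, 0x09, 0x0A, 0x0D, 0x1A}


def escape_problem_bytes(data):
    """Replace each problematic byte with its two-digit hex value as ASCII text.

    Hypothetical helper; the exact replacement format is an assumption.
    """
    out = bytearray()
    for b in data:
        if b in PROBLEM_BYTES:
            out += f"{b:02x}".encode("ascii")
        else:
            out.append(b)
    return bytes(out)


if __name__ == "__main__":
    # Dump the single-byte range 0x00..0x1f with the problem bytes escaped.
    print(escape_problem_bytes(bytes(range(0x20))))
```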

You'll notice there are a lot of differences compared to what iconv outputs. One interesting range is the 0x80 half-width range, which has half-width hiragana. I haven't found Unicode equivalents for those!