Opened 6 years ago

Last modified 5 years ago

#239 testing defect

Fix all Smalltalk/X source files to be in Unicode (UTF-8)

Reported by: patrik.svestka@…
Owned by:
Priority: major
Milestone:
Component: default
Keywords:
Cc:
Also affects CVS HEAD (eXept version): no

Description

There appear to be encoding issues with the Smalltalk/X source files. All files stored in the Mercurial repositories should be converted to Unicode (UTF-8). The files in CVS are not to be changed.

Attachments (7)

fixing_encoding_and_adding_encoding_directive.ps1 (3.4 KB ) - added by patrik.svestka@… 6 years ago.
PowerShell script that fixes the encoding in all files
goodies_unicode_conversion.log (291.5 KB ) - added by patrik.svestka@… 6 years ago.
log file from 'goodies' directory
stx_unicode_conversion.log (224.1 KB ) - added by patrik.svestka@… 6 years ago.
log file from 'stx' directory (except for 'goodies' directory)
patches_for_[#239]_Encoding.7z (103.8 KB ) - added by patrik.svestka@… 6 years ago.
Encoding patches for Mercurial based repositories
unicode_patches_20181115.7z (67.8 KB ) - added by patrik.svestka@… 5 years ago.
Re-patch of all files
encoding.st (4.2 KB ) - added by patrik.svestka@… 5 years ago.
Smalltalk script used for repatching (some minor manual intervention was still needed)
unicode_re-patches_20181115.7z (83.1 KB ) - added by patrik.svestka@… 5 years ago.
Re-patch of all files


Change History (15)

comment:1 by patrik.svestka@…, 6 years ago

I have created a PowerShell script which does all the conversion automagically :). I'm adding it to the ticket.

It has to be run twice: once in C:\prg_sdk\stx8-jv_swing\build\stx and a second time in C:\prg_sdk\stx8-jv_swing\build\stx\goodies.

by patrik.svestka@…, 6 years ago

PowerShell script that fixes the encoding in all files

by patrik.svestka@…, 6 years ago

log file from 'goodies' directory

by patrik.svestka@…, 6 years ago

Attachment: stx_unicode_conversion.log added

log file from 'stx' directory (except for 'goodies' directory)

comment:2 by patrik.svestka@…, 6 years ago

Sigh, the re-encoding script is not working properly in all cases. PowerShell's Get-Content detects the encoding automatically, but it gets it wrong when the source encoding is UTF-8 without a BOM.

If such a file is loaded and then saved, its non-ASCII characters are damaged. I'll try to figure out a solution for that.
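
One way around this kind of misdetection is to skip auto-detection entirely: attempt a strict UTF-8 decode first and fall back to a legacy encoding only when that fails. The following is a minimal Python sketch of that idea, not the script attached to this ticket. The helper name `read_source` is hypothetical, and ISO-8859-1 as the fallback is an assumption, since the ticket does not say which legacy encoding the files use.

```python
import codecs
from typing import Tuple


def read_source(data: bytes, legacy_encoding: str = "iso-8859-1") -> Tuple[str, str]:
    """Decode source bytes and return (text, encoding_used).

    legacy_encoding is an assumed fallback; adjust it to the real code page.
    """
    # A UTF-8 BOM is unambiguous: strip it and decode the remainder.
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):].decode("utf-8"), "utf-8-sig"
    try:
        # Strict decoding raises on any byte sequence that is not valid UTF-8,
        # so BOM-less UTF-8 files are recognised without guessing.
        return data.decode("utf-8"), "utf-8"
    except UnicodeDecodeError:
        # Not valid UTF-8: treat the bytes as the assumed legacy encoding.
        return data.decode(legacy_encoding), legacy_encoding
```

Pure ASCII files decode identically either way, so they are reported as "utf-8" and need no conversion.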

comment:3 by patrik.svestka@…, 6 years ago

I have found a solution. I took the original files and the same files read with UTF8 encoding forced, and compared them manually with a diff utility.
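
A comparison of this kind can be sketched in Python by decoding the same bytes twice and diffing the results, which shows exactly which lines change depending on the assumed encoding. This is only an illustration of the approach, not the procedure actually used. `decoding_diff` is a hypothetical helper, and ISO-8859-1 is an assumed stand-in for whatever encoding the auto-detection picked.

```python
import difflib
from typing import List


def decoding_diff(data: bytes, legacy_encoding: str = "iso-8859-1") -> List[str]:
    """Return unified-diff lines between two decodings of the same bytes."""
    # Split on "\n" only, so that no other character is treated as a line break.
    as_legacy = data.decode(legacy_encoding).split("\n")
    # errors="replace" keeps the comparison going even on invalid UTF-8.
    as_utf8 = data.decode("utf-8", errors="replace").split("\n")
    return list(
        difflib.unified_diff(
            as_legacy, as_utf8, fromfile="legacy", tofile="utf-8", lineterm=""
        )
    )
```

An empty result means the file is pure ASCII and is unaffected by the choice of encoding.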

I have addressed all the issues and rerun the tests; everything that passed before still passes.

Please see all the patches at patches_for_[#239]_Encoding.7z

comment:4 by patrik.svestka@…, 6 years ago

Status: new → testing

by patrik.svestka@…, 6 years ago

Encoding patches for Mercurial based repositories

comment:5 by patrik.svestka@…, 5 years ago

Based on our discussion I have made changes to the patches above via the encoding.st script (executed via smalltalk.bat --execute c:\<path>\stx\encoding.st).

These changes are mainly:

  • only files that contain a non-ASCII character have a header starting with: "{ Encoding: utf8 }"
  • all files now have Unix EOL (String lf) - some files had mixed line endings (String crlf and String lf), and some even had classic Mac line endings (String cr).
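
The two rules above can be sketched in Python as follows. This is an illustration only and not a translation of encoding.st. The function name `normalize` is hypothetical, and the exact header layout (header line followed by one blank line) as well as the removal of the header from ASCII-only files are assumptions.

```python
HEADER = '"{ Encoding: utf8 }"'


def normalize(text: str) -> str:
    """Unify line endings to LF and keep the header only where it is needed."""
    # Convert CRLF first, then any remaining lone CR, so that a CRLF pair
    # does not turn into two LFs.
    text = text.replace("\r\n", "\n").replace("\r", "\n")

    has_header = text.startswith(HEADER)
    # The header itself is pure ASCII, so it never triggers this check.
    needs_header = any(ord(ch) > 127 for ch in text)

    if needs_header and not has_header:
        # Assumed layout: header line, blank line, then the source.
        text = HEADER + "\n\n" + text
    elif has_header and not needs_header:
        # ASCII-only file: drop the header and the blank lines after it.
        text = text[len(HEADER):].lstrip("\n")
    return text
```

The function is idempotent, so running it again over already converted files leaves them unchanged.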

I'm attaching new patches, rebased onto those that I've already received.

by patrik.svestka@…, 5 years ago

Attachment: unicode_patches_20181115.7z added

Re-patch of all files

by patrik.svestka@…, 5 years ago

Attachment: encoding.st added

Smalltalk script used for repatching (some minor manual intervention was still needed)

comment:6 by patrik.svestka@…, 5 years ago

Sigh, I made a silly mistake with the Mercurial rebase. I'm republishing the patches as unicode_re-patches_20181115.7z, which should have it fixed.

by patrik.svestka@…, 5 years ago

Re-patch of all files

comment:7 by patrik.svestka@…, 5 years ago

You may notice that some of the files appear not to have any changes. These files had issues with their line endings: in some cases the endings were CR only, in others CRLF, or a mix.

I have unified all EOL to LF only.
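
To check which files fall into this category, the line endings can be counted on the raw bytes, since many diff viewers hide EOL-only changes. The snippet below is a hedged Python sketch; `eol_style` is a hypothetical helper and is not part of the attached scripts.

```python
def eol_style(data: bytes) -> str:
    """Classify line endings as "LF", "CRLF", "CR", "mixed" or "none"."""
    crlf = data.count(b"\r\n")
    # Subtract the CRLF pairs so only lone CRs and lone LFs remain.
    cr = data.count(b"\r") - crlf
    lf = data.count(b"\n") - crlf

    kinds = [name for name, n in (("CRLF", crlf), ("CR", cr), ("LF", lf)) if n]
    if not kinds:
        return "none"
    return kinds[0] if len(kinds) == 1 else "mixed"
```

Any file for which this returns something other than "LF" or "none" would show up in the patches even if its visible text is unchanged.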

Last edited 5 years ago by patrik.svestka@…

comment:8 by patrik.svestka@…, 5 years ago

Jan, the patches for the goodies directory are probably missing from the patches you have applied.
