Opened 5 years ago
Last modified 4 years ago
#239 testing defect
Fix all Smalltalk/X source files to be in Unicode (UTF-8)
Reported by: | Patrik Svestka | Owned by: | |
---|---|---|---|
Priority: | major | Milestone: | |
Component: | default | Keywords: | |
Cc: | | Also affects CVS HEAD (eXept version): | no |
Description
It appears there are some encoding issues with the Smalltalk/X source files. All the files that are stored in Mercurial should be converted to Unicode (UTF-8). The files in CVS are not to be adjusted.
Attachments (7)
Change History (15)
comment:1 Changed 5 years ago by
Changed 5 years ago by
Attachment: | fixing_encoding_and_adding_encoding_directive.ps1 added |
---|
PowerShell script that fixes the encoding in all files
Changed 5 years ago by
Attachment: | goodies_unicode_conversion.log added |
---|
log file from 'goodies' directory
Changed 5 years ago by
Attachment: | stx_unicode_conversion.log added |
---|
log file from 'stx' directory (except for 'goodies' directory)
comment:2 Changed 5 years ago by
Sigh, the re-encoding script is not working properly in all cases. PowerShell's Get-Content detects the encoding automatically, but incorrectly if the source encoding is UTF-8 without a BOM.
If I load such a file and try to save it, the encoded characters are damaged. I'll try to figure out a solution for that.
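The failure mode described above can be illustrated with a short Python sketch. It is not the attached PowerShell script; the function name `decode_source` and the `cp1252` fallback code page are assumptions made for illustration, since the ticket does not say which legacy encoding the files used.

```python
import codecs
from typing import Tuple


def decode_source(raw: bytes, fallback: str = "cp1252") -> Tuple[str, str]:
    """Return (text, encoding_used) for the raw bytes of a source file."""
    # A UTF-8 BOM is unambiguous, so honour it first and strip it.
    if raw.startswith(codecs.BOM_UTF8):
        return raw[len(codecs.BOM_UTF8):].decode("utf-8"), "utf-8-sig"
    try:
        # Strict decoding rejects byte sequences that are not valid UTF-8,
        # which is how most accented legacy single-byte text looks.
        return raw.decode("utf-8"), "utf-8"
    except UnicodeDecodeError:
        # Assumed legacy code page (hypothetical choice, not from the ticket).
        return raw.decode(fallback), fallback


# BOM-less UTF-8 misread as a single-byte code page: one character becomes two.
raw = "é".encode("utf-8")        # the two bytes C3 A9
damaged = raw.decode("cp1252")   # each byte is mapped to its own character
```

Trying strict UTF-8 before falling back avoids the damage, because valid UTF-8 without a BOM is then never reinterpreted as a legacy code page.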
comment:3 Changed 5 years ago by
I have found a solution. I have taken the original files and some files read with forced UTF8 encoding and compared them manually with a diff utility.
I have addressed all issues and run the tests; everything that passed before passes now too.
Please see all the patches at patches_for_[#239]_Encoding.7z
comment:4 Changed 5 years ago by
Status: | new → testing |
---|
Changed 5 years ago by
Attachment: | patches_for_[#239]_Encoding.7z added |
---|
Encoding patches for Mercurial based repositories
comment:5 Changed 5 years ago by
Based on our discussion I have made changes to the patches above via the encoding.st script (executed via smalltalk.bat --execute c:\<path>\stx\encoding.st).
These changes are mainly:
- only files that contain a non-ASCII character now start with the header "{ Encoding: utf8 }"
- all files now have Unix EOL (String lf); some files had mixed line endings (String crlf and String lf) and some even had Mac line endings (String cr).
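The two rules above can be sketched in Python as follows. This is only an illustration, not a translation of encoding.st; the function name `normalize_source` and the choice to put the header on its own first line are assumptions.

```python
# Header text as quoted in the ticket.
HEADER = '"{ Encoding: utf8 }"'


def normalize_source(text: str) -> str:
    """Unify line endings to LF and add the header only when it is needed."""
    # Replace CRLF first so the pair does not turn into two LFs,
    # then convert any remaining lone CR (old Mac line endings).
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    # Only files with a character above the ASCII range get the header.
    has_non_ascii = any(ord(ch) > 127 for ch in text)
    if has_non_ascii and not text.startswith(HEADER):
        # Assumption: the header goes on its own first line.
        text = HEADER + "\n" + text
    return text
```

The function is idempotent, so running it again over an already converted file leaves the file unchanged. It does not remove a header from a pure ASCII file; whether that is wanted is not stated in the ticket.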
I'm attaching new patches. I have rebased those patches that I've already received.
Changed 5 years ago by
Attachment: | encoding.st added |
---|
Smalltalk script used for repatching (some minor manual intervention was still needed)
comment:6 Changed 5 years ago by
Sigh, I have made a silly mistake with the Mercurial rebase. I'm republishing the file unicode_re-patches_20181115.7z, which should have it fixed.
comment:7 Changed 5 years ago by
You may notice that some of the files appear not to have any changes. These files had issues with the line ends. There were cases where the ends were CR only, CRLF, or mixed.
I have unified all EOL to LF only.
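To check which line-ending styles a file contains, and so to see why a file that looks unchanged was still touched, a small Python sketch like the one below could be used. The helper name `line_ending_styles` is hypothetical and is not part of the attached scripts.

```python
import re

# Maps each line-ending byte sequence to a readable name.
_STYLE_NAMES = {b"\r\n": "CRLF", b"\r": "CR", b"\n": "LF"}


def line_ending_styles(raw: bytes) -> set:
    """Return the set of line-ending styles present in the raw file bytes."""
    styles = set()
    # CRLF is listed first in the pattern so the pair is counted once
    # instead of as a CR followed by an LF.
    for match in re.finditer(rb"\r\n|\r|\n", raw):
        styles.add(_STYLE_NAMES[match.group()])
    return styles
```

A result with more than one element means the file had mixed line ends. The check works on raw bytes because opening a file in text mode would already translate the line ends.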
comment:8 Changed 4 years ago by
Jan, probably the patches in the goodies directory are missing from the patches you have applied.
I have created a powershell script which does all the conversion automagically :). I'm adding it to the ticket.
It has to be run twice: once in C:\prg_sdk\stx8-jv_swing\build\stx and a second time in C:\prg_sdk\stx8-jv_swing\build\stx\goodies.