All times are UTC - 5 hours




 Post subject: Text Encoding UTF-8 vs UTF-16, etc and C++ Impact
PostPosted: Tue Jun 16, 2015 11:33 pm 

Joined: Mon Mar 24, 2014 2:25 pm
Posts: 69
Marek,

While building an engine as a hobby and following along with your Shader VMKs, I've been using wstring throughout the code to support what I thought was better Unicode cross-compatibility. The more I read, the more I believe I need to use std::string throughout, on the assumption that all content generated and consumed is UTF-8, the preferred Unicode encoding at this time.

UTF-8 w/o BOM seems to be the most stable and sensible step forward, but its actual implementation in C++ and its cross-compatibility with Windows, Mac, Linux, etc. are proving a challenge for me to grasp and implement.
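
As a minimal sketch of what "UTF-8 w/o BOM" can mean in practice: text read from files written by other tools may still start with the three-byte UTF-8 BOM (EF BB BF), so one option is to strip it on input and never write it on output. The helper names hasUtf8Bom and stripUtf8Bom below are hypothetical, not part of any engine or library.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: true if `text` starts with the UTF-8 BOM (EF BB BF).
bool hasUtf8Bom(const std::string& text) {
    return text.size() >= 3 &&
           static_cast<unsigned char>(text[0]) == 0xEF &&
           static_cast<unsigned char>(text[1]) == 0xBB &&
           static_cast<unsigned char>(text[2]) == 0xBF;
}

// Hypothetical helper: returns a copy of `text` without a leading BOM.
std::string stripUtf8Bom(const std::string& text) {
    return hasUtf8Bom(text) ? text.substr(3) : text;
}

// Small usage check: the BOM is removed, BOM-free text is left unchanged.
void bomExample() {
    const std::string withBom = "\xEF\xBB\xBF" "hello";
    assert(stripUtf8Bom(withBom) == "hello");
    assert(stripUtf8Bom("hello") == "hello");
}
```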

Do you know whether your current engine implementation, as written, would allow UTF-8 international characters in filenames and similar strings? My heavily modified version does not currently accept these characters properly and still build on all 3 platforms. I think the best step would be to clone my current source so far and go through the tough process of converting everything to std::string, widening only as necessary to interact with the specific pre-built calls, but it's going to take a while.

I've been studying a few links referenced here:
http://utf8everywhere.org/
http://utfcpp.sourceforge.net/
http://www.nubaria.com/en/blog/?p=289

Before I go through this rewrite, your input and thoughts on the topic would be greatly appreciated, whether through a new VMK or forum discussion, if your time permits.

Thanks in advance,
Scott (Mebourne)


 Post subject: Re: Text Encoding UTF-8 vs UTF-16, etc and C++ Impact
PostPosted: Wed Jun 17, 2015 7:58 am 
Site Admin

Joined: Sun Feb 11, 2007 8:59 am
Posts: 1105
Location: Ontario Canada
Mebourne wrote:
I think the best step would be to clone my current source so far and go through the tough process of converting everything to std::string and only widen as necessary to interact with the specific pre-built calls


That is exactly what you will need to do if you want to support UTF-8. It is a lot of work but something that needs to get done if you intend to distribute across different languages.
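
A rough sketch of the "widen only as necessary" step, assuming strings are kept as UTF-8 in std::string everywhere and converted to UTF-16 only at the boundary with a wide-character API. The function name widenUtf8 is hypothetical. It is written in portable C++11 and returns std::u16string rather than std::wstring, because wchar_t is 16 bits on Windows but typically 32 bits on Linux and Mac. On Windows, the Win32 function MultiByteToWideChar with CP_UTF8 is the more usual way to do the same conversion.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>

// Hypothetical helper: decodes UTF-8 bytes into UTF-16 code units.
// Throws std::invalid_argument on malformed input.
std::u16string widenUtf8(const std::string& utf8) {
    std::u16string out;
    std::size_t i = 0;
    while (i < utf8.size()) {
        const unsigned char lead = static_cast<unsigned char>(utf8[i]);
        std::uint32_t cp = 0;
        std::size_t extra = 0;  // continuation bytes expected after the lead
        if (lead < 0x80)                { cp = lead;        extra = 0; }
        else if ((lead & 0xE0) == 0xC0) { cp = lead & 0x1F; extra = 1; }
        else if ((lead & 0xF0) == 0xE0) { cp = lead & 0x0F; extra = 2; }
        else if ((lead & 0xF8) == 0xF0) { cp = lead & 0x07; extra = 3; }
        else throw std::invalid_argument("invalid UTF-8 lead byte");

        if (extra > utf8.size() - i - 1)
            throw std::invalid_argument("truncated UTF-8 sequence");
        for (std::size_t k = 1; k <= extra; ++k) {
            const unsigned char c = static_cast<unsigned char>(utf8[i + k]);
            if ((c & 0xC0) != 0x80)
                throw std::invalid_argument("invalid UTF-8 continuation byte");
            cp = (cp << 6) | (c & 0x3F);
        }
        i += extra + 1;

        // Reject overlong encodings, surrogates, and out-of-range values.
        static const std::uint32_t minForLength[] = {0x0, 0x80, 0x800, 0x10000};
        if (cp < minForLength[extra] || cp > 0x10FFFF ||
            (cp >= 0xD800 && cp <= 0xDFFF))
            throw std::invalid_argument("invalid code point");

        if (cp < 0x10000) {
            out.push_back(static_cast<char16_t>(cp));
        } else {
            // Code points above U+FFFF become a surrogate pair.
            cp -= 0x10000;
            out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
            out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
        }
    }
    return out;
}

// Small usage check: plain ASCII maps one byte to one code unit.
void widenExample() {
    assert(widenUtf8("abc") == u"abc");
}
```

The idea is that only the thin layer that talks to a wide-character API calls a function like this; the rest of the engine passes std::string around untouched.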


 Post subject: Re: Text Encoding UTF-8 vs UTF-16, etc and C++ Impact
PostPosted: Wed Jun 17, 2015 12:53 pm 

Joined: Mon Mar 24, 2014 2:25 pm
Posts: 69
Thanks for the follow-up, Marek. While I suspect this engine will never leave my own hard drive, my intent is to learn the best practices along the way with this hobby development. I know you were using std::string, and I should have followed along with that.

Using std::string with the assumption of UTF-8, instead of std::wstring, seems to be the preferred cross-platform localization method out there. It will be interesting to see how the code will need to change, though, to detect the variable-width (multi-byte) characters that could be present.

I'll be testing over the next week or so, but any insight you have into your best practices to handle this would be appreciated.
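
On detecting the variable-width characters mentioned above, here is a minimal sketch of one common approach, assuming the input is already valid UTF-8. The lead byte of each sequence tells you how many bytes it occupies, and continuation bytes always have the bit pattern 10xxxxxx, so code points can be counted without fully decoding them. The names utf8SequenceLength and countCodePoints are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: number of bytes in the UTF-8 sequence that starts
// with `lead`, or 0 if `lead` cannot start a sequence.
std::size_t utf8SequenceLength(unsigned char lead) {
    if (lead < 0x80) return 1;            // 0xxxxxxx: ASCII
    if ((lead & 0xE0) == 0xC0) return 2;  // 110xxxxx
    if ((lead & 0xF0) == 0xE0) return 3;  // 1110xxxx
    if ((lead & 0xF8) == 0xF0) return 4;  // 11110xxx
    return 0;                             // continuation or invalid byte
}

// Hypothetical helper: counts code points by counting every byte that is
// not a continuation byte (10xxxxxx). Assumes `utf8` is valid UTF-8.
std::size_t countCodePoints(const std::string& utf8) {
    std::size_t count = 0;
    for (unsigned char c : utf8) {
        if ((c & 0xC0) != 0x80) ++count;
    }
    return count;
}

// Small usage check: "naive" with a diaeresis on the i is 6 bytes
// but 5 code points.
void countExample() {
    const std::string word = "na\xC3\xAF" "ve";
    assert(word.size() == 6);
    assert(countCodePoints(word) == 5);
}
```

Note that std::string::size() keeps reporting bytes, and that a code point is still not always one user-visible character (combining marks, for example), so this only covers the byte-to-code-point part of the problem.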

