You have a Perl-based website and it’s time to migrate it from a Latin-based encoding to UTF-8. Perl has many pieces to the encoding puzzle, and a road map is useful here. By the end of this talk you should understand the basics of converting your data to UTF-8, how to ensure that your website outputs UTF-8 correctly, and how to debug any encoding issues that might crop up.
THE COMMON ENCODING TYPES
Brief overview of the Latin-1 (ISO-8859-1) and Windows-1252 encodings.
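As a rough illustration of why these two encodings are often confused, the sketch below lists the byte values that Latin-1 and Windows-1252 decode differently. It is written in Python purely for illustration (it is not the talk's Perl code), and the function name differing_bytes is hypothetical.

```python
def differing_bytes():
    """Return {byte value: (latin1_char, cp1252_char_or_None)} for every
    byte that the two encodings decode differently."""
    diffs = {}
    for b in range(256):
        raw = bytes([b])
        latin1 = raw.decode("latin-1")  # Latin-1 maps every byte to U+0000..U+00FF
        try:
            cp1252 = raw.decode("cp1252")
        except UnicodeDecodeError:
            cp1252 = None  # byte is unassigned in Python's cp1252 codec
        if latin1 != cp1252:
            diffs[b] = (latin1, cp1252)
    return diffs


if __name__ == "__main__":
    for value, (latin1, cp1252) in sorted(differing_bytes().items()):
        print(f"0x{value:02X}: latin-1={latin1!r} cp1252={cp1252!r}")
```

With Python's codecs, every difference falls in the range 0x80 to 0x9F: Latin-1 treats those bytes as control characters, while Windows-1252 assigns most of them printable characters such as the euro sign and curly quotes. Bytes outside that range, including accented letters like é (0xE9), decode identically.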
UTF-8: A BRAVE NEW WORLD
Brief overview of the UTF-8 encoding standard with regard to the 1, 2, 3 and 4 byte encodings and how the bits are encoded.
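The bit layout can be sketched by encoding a single code point by hand. This is an illustrative Python sketch, not material from the talk; the function name utf8_encode is hypothetical, and the result is checked against Python's built-in encoder.

```python
def utf8_encode(cp: int) -> bytes:
    """Encode one Unicode code point as UTF-8 by hand.

    Layout (x = payload bits taken from the code point):
      1 byte : 0xxxxxxx                             U+0000  .. U+007F
      2 bytes: 110xxxxx 10xxxxxx                    U+0080  .. U+07FF
      3 bytes: 1110xxxx 10xxxxxx 10xxxxxx           U+0800  .. U+FFFF
      4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx  U+10000 .. U+10FFFF
    """
    if not 0 <= cp <= 0x10FFFF or 0xD800 <= cp <= 0xDFFF:
        raise ValueError(f"not an encodable code point: {cp:#x}")
    if cp < 0x80:
        return bytes([cp])
    if cp < 0x800:
        return bytes([0xC0 | (cp >> 6),
                      0x80 | (cp & 0x3F)])
    if cp < 0x10000:
        return bytes([0xE0 | (cp >> 12),
                      0x80 | ((cp >> 6) & 0x3F),
                      0x80 | (cp & 0x3F)])
    return bytes([0xF0 | (cp >> 18),
                  0x80 | ((cp >> 12) & 0x3F),
                  0x80 | ((cp >> 6) & 0x3F),
                  0x80 | (cp & 0x3F)])


if __name__ == "__main__":
    for ch in ["A", "\u00e9", "\u20ac", "\U0001F600"]:
        encoded = utf8_encode(ord(ch))
        assert encoded == ch.encode("utf-8")  # agrees with the built-in encoder
        print(f"U+{ord(ch):04X} -> {encoded.hex(' ')}")
```

For example, é (U+00E9) needs two bytes, c3 a9, and the euro sign (U+20AC) needs three, e2 82 ac.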
PERL AND UTF-8
How to do the following in Perl:
Some tips on how to debug some common encoding issues.
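The abstract does not list the specific tips, so the following is only one example of a common symptom, not the talk's own material: UTF-8 bytes that were mistakenly decoded as Latin-1 show up as sequences like "Ã©". The Python sketch below reproduces the problem and reverses it; the function name repair_double_encoding is hypothetical, and the approach assumes the text was mis-decoded exactly once.

```python
def repair_double_encoding(text: str) -> str:
    """Undo one round of 'UTF-8 bytes read as Latin-1' mojibake.

    If the text cannot be reinterpreted that way, it is returned
    unchanged, on the assumption that it was not damaged.
    """
    try:
        return text.encode("latin-1").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        return text


if __name__ == "__main__":
    original = "caf\u00e9"
    # Simulate the bug: correct UTF-8 bytes decoded with the wrong codec.
    garbled = original.encode("utf-8").decode("latin-1")
    print(repr(garbled))
    print(repr(repair_double_encoding(garbled)))
```

A repair like this is a diagnostic aid rather than a fix; the durable solution is to find the layer that decodes with the wrong encoding.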
The transition from traditional encodings to UTF-8 is not easy to navigate, but with perseverance it is doable. We have illustrated the common encodings, how to process our data in this environment, and how to tackle any issues that might arise.
Currently a Perl hacker for Xerox.com, though in a previous life he worked for the military-industrial complex.