CodeCamp Iasi 10 march 2012 - UTF-8

Date post: 25-Dec-2014
Upload: codecampiasi
Tuesday, 13 March 2012
Transcript
Page 1: CodeCamp Iasi 10 march 2012 - UTF-8


Page 2: CodeCamp Iasi 10 march 2012 - UTF-8

UTF-8: The What, Why, and How

Iulian Dogariu, 10.03.2012


Page 3: CodeCamp Iasi 10 march 2012 - UTF-8

Te-a°tept disearã la portiÞã
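
Text like the line above is typical of bytes written in one encoding and displayed in another. A minimal sketch of how that can happen, assuming (a guess, not something the slide states) that the intended Romanian phrase was "Te-aştept diseară la portiţă", stored as ISO-8859-2 and displayed as ISO-8859-1. The output is similar to the slide's garbling but not identical, so the encodings actually involved may have been different:

```python
# Assumed original phrase (a guess), written with the cedilla forms of s and t
# because those are the ones ISO-8859-2 can represent.
intended = "Te-a\u015ftept disear\u0103 la porti\u0163\u0103"

stored = intended.encode("iso-8859-2")   # what the writer saved: one byte per character
garbled = stored.decode("iso-8859-1")    # what a reader assuming the wrong charset sees
print(garbled)

# The bytes themselves are intact, so the right charset recovers the text.
recovered = stored.decode("iso-8859-2")
print(recovered == intended)
```

The point of the sketch is that nothing in the bytes says which character set they belong to.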


Page 4: CodeCamp Iasi 10 march 2012 - UTF-8

00000000 89 50 4e 47 0d 0a 1a 0a 00 00 00 0d 49 48 44 52 |.PNG........IHDR|
00000010 00 00 02 e5 00 00 00 8f 08 06 00 00 00 5d ca a5 |.............]..|
00000020 eb 00 00 0a 43 69 43 43 50 49 43 43 20 70 72 6f |....CiCCPICC pro|
00000030 66 69 6c 65 00 00 78 da 9d 53 77 58 93 f7 16 3e |file..x..SwX...>|
00000040 df f7 65 0f 56 42 d8 f0 b1 97 6c 81 00 22 23 ac |..e.VB....l.."#.|
00000050 08 c8 10 59 a2 10 92 00 61 84 10 12 40 c5 85 88 |...Y....a...@...|
00000060 0a 56 14 15 11 9c 48 55 c4 82 d5 0a 48 9d 88 e2 |.V....HU....H...|
00000070 a0 28 b8 67 41 8a 88 5a 8b 55 5c 38 ee 1f dc a7 |.(.gA..Z.U\8....|
00000080 b5 7d 7a ef ed ed fb d7 fb bc e7 9c e7 fc ce 79 |.}z............y|
00000090 cf 0f 80 11 12 26 91 e6 a2 6a 00 39 52 85 3c 3a |.....&...j.9R.<:|
000000a0 d8 1f 8f 4f 48 c4 c9 bd 80 02 15 48 e0 04 20 10 |...OH......H.. .|
000000b0 e6 cb c2 67 05 c5 00 00 f0 03 79 78 7e 74 b0 3f |...g......yx~t.?|


Page 5: CodeCamp Iasi 10 march 2012 - UTF-8

“ASCII is allright”

41 53 43 49 49 20 69 73 20 61 6c 6c 72 69 67 68 74
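
The hex values above are simply the ASCII code of each character in order. A short sketch that reproduces them (the variable names are illustrative only):

```python
text = "ASCII is allright"      # spelling kept exactly as on the slide
data = text.encode("ascii")     # one byte per character
hex_dump = data.hex(" ")        # the separator argument needs Python 3.8+
print(hex_dump)

# Every ASCII code fits in 7 bits, so the top bit of each byte is 0.
print(all(b < 0x80 for b in data))
```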


Page 6: CodeCamp Iasi 10 march 2012 - UTF-8

ASCII


Page 7: CodeCamp Iasi 10 march 2012 - UTF-8


Page 8: CodeCamp Iasi 10 march 2012 - UTF-8


Page 9: CodeCamp Iasi 10 march 2012 - UTF-8

One byte: not enough


Page 10: CodeCamp Iasi 10 march 2012 - UTF-8

“Unicode is KEWL”

ASCII

55 6e 69 63 6f 64 65 20 69 73 20 4b 45 57 4c 0a


Page 11: CodeCamp Iasi 10 march 2012 - UTF-8

55 00 6e 00 69 00 63 00 6f 00 64 00 65 00 20 00
69 00 73 00 20 00 4b 00 45 00 57 00 4c 00 0a 00

Unicode UCS-2

“Unicode is KEWL”


Page 12: CodeCamp Iasi 10 march 2012 - UTF-8

Two bytes: not enough (!)


Page 13: CodeCamp Iasi 10 march 2012 - UTF-8

55 00 00 00 6e 00 00 00 69 00 00 00 63 00 00 00
6f 00 00 00 64 00 00 00 65 00 00 00 20 00 00 00
69 00 00 00 73 00 00 00 20 00 00 00 4b 00 00 00
45 00 00 00 57 00 00 00 4c 00 00 00 0a 00 00 00

Unicode UCS-4

Unicode is KEWL
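
The same 16-character text costs 16, 32, or 64 bytes depending on the fixed width chosen. A sketch that reproduces the dumps with Python's built-in codecs, using utf-16-le and utf-32-le as stand-ins for little-endian UCS-2 and UCS-4 (an assumption that is safe here because every character in the text is plain ASCII):

```python
text = "Unicode is KEWL\n"

ascii_bytes = text.encode("ascii")       # 1 byte per character
ucs2_bytes = text.encode("utf-16-le")    # 2 bytes per character, no byte-order mark
ucs4_bytes = text.encode("utf-32-le")    # 4 bytes per character, no byte-order mark

for name, data in [("ASCII", ascii_bytes), ("UCS-2", ucs2_bytes), ("UCS-4", ucs4_bytes)]:
    print(f"{name}: {len(data)} bytes")
    print(data.hex(" "))                 # the separator argument needs Python 3.8+
```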


Page 14: CodeCamp Iasi 10 march 2012 - UTF-8

55 00 6e 00 69 00 63 00 6f 00 64 00 65 00 20 00
69 00 73 00 20 00 4b 00 45 00 57 00 4c 00 0a 00

“0” bytes

“Unicode is KEWL”
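
The slide does not say why the zero bytes matter. A likely reason is that byte-oriented code written for C-style strings treats a 0 byte as the end of the string. A sketch of that effect, where c_string_view is a hypothetical helper imitating such code:

```python
def c_string_view(data: bytes) -> bytes:
    """Return what a NUL-terminated string routine would see: bytes before the first 0."""
    end = data.find(b"\x00")
    return data if end == -1 else data[:end]

ucs2 = "Unicode is KEWL\n".encode("utf-16-le")   # stand-in for little-endian UCS-2
zero_count = ucs2.count(0)                        # number of bytes with value 0
seen = c_string_view(ucs2)

print(zero_count)   # half of the 32 bytes are zero
print(seen)         # only the first byte survives

# The UTF-8 form of the same text contains no zero bytes at all.
utf8 = "Unicode is KEWL\n".encode("utf-8")
print(utf8.count(0))
```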


Page 15: CodeCamp Iasi 10 march 2012 - UTF-8

... 55 f4 3a ff 6e ac 12 43 69 fa 3f 1a 63 ff ...

... 55 f4 3a ff 6e ac 12 43 69 fa 3f 1a 63 ff ...

UCS-2


Page 16: CodeCamp Iasi 10 march 2012 - UTF-8

... 55 f4 3a ff 6e ac 12 43 69 fa 3f 1a 63 ff ...

This way?

... 55 f4 3a ff 6e ac 12 43 69 fa 3f 1a 63 ff ...

... or this way?
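
A reader that joins a two-byte stream at an arbitrary position cannot tell which bytes belong together. The sketch below groups the slide's bytes both ways; pairs is a hypothetical helper, and nothing in the data marks either grouping as the right one:

```python
stream = bytes.fromhex("55 f4 3a ff 6e ac 12 43 69 fa 3f 1a 63 ff")

def pairs(data: bytes, offset: int) -> list:
    """Group data into complete 2-byte units, starting at the given offset."""
    return [data[i:i + 2].hex(" ") for i in range(offset, len(data) - 1, 2)]

this_way = pairs(stream, 0)   # reader started on an even byte
that_way = pairs(stream, 1)   # reader started one byte later

print(this_way)
print(that_way)
```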


Page 17: CodeCamp Iasi 10 march 2012 - UTF-8

Little Endian: “U” = 55 00

Big Endian: “U” = 00 55
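
Both byte orders can be produced with Python's utf-16-le and utf-16-be codecs (used here as stand-ins for the two UCS-2 byte orders). The sketch also shows what happens when a reader assumes the wrong order:

```python
little = "U".encode("utf-16-le")   # low byte first
big = "U".encode("utf-16-be")      # high byte first

print(little.hex(" "))
print(big.hex(" "))

# Reading little-endian bytes as big-endian yields a different, valid character.
misread = little.decode("utf-16-be")
print(f"U+{ord(misread):04X}")
```

No error is raised in the misread case, which is what makes a wrong byte order hard to notice.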


Page 18: CodeCamp Iasi 10 march 2012 - UTF-8

1) Storage space

2) “0” bytes

3) Synchronisation

4) Endianness


Page 19: CodeCamp Iasi 10 march 2012 - UTF-8

UTF-8


Page 20: CodeCamp Iasi 10 march 2012 - UTF-8

Variable length encoding

a   61
ж   d0 b6
龍  e9 be 8d
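
The byte sequences for the three characters above can be checked with Python's built-in UTF-8 codec. A short sketch (the dictionary name is illustrative only):

```python
# Map each sample character to the hex form of its UTF-8 bytes.
encoded = {ch: ch.encode("utf-8").hex(" ") for ch in "aж龍"}   # hex(sep) needs Python 3.8+

for ch, hex_bytes in encoded.items():
    byte_count = len(ch.encode("utf-8"))
    print(f"U+{ord(ch):04X} -> {hex_bytes} ({byte_count} byte(s))")
```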


Page 21: CodeCamp Iasi 10 march 2012 - UTF-8

a

Unicode number: 00 61 = 0000 0000 0110 0001

UTF-8 encoding: 0--- ----


Page 22: CodeCamp Iasi 10 march 2012 - UTF-8

a

Unicode number: 00 61 = 0000 0000 0110 0001

UTF-8 encoding: 61 = 0110 0001


Page 23: CodeCamp Iasi 10 march 2012 - UTF-8

ж

Unicode number: 04 36 = 0000 0100 0011 0110

UTF-8 encoding: 110- ---- 10-- ----


Page 24: CodeCamp Iasi 10 march 2012 - UTF-8

ж

Unicode number: 04 36 = 0000 0100 0011 0110

UTF-8 encoding: d0 b6 = 1101 0000 1011 0110


Page 25: CodeCamp Iasi 10 march 2012 - UTF-8

龍

Unicode number: 9f 8d = 1001 1111 1000 1101

UTF-8 encoding: 1110 ---- 10-- ---- 10-- ----


Page 26: CodeCamp Iasi 10 march 2012 - UTF-8

龍

Unicode number: 9f 8d = 1001 1111 1000 1101

UTF-8 encoding: e9 be 8d = 1110 1001 1011 1110 1000 1101
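
The three worked examples follow one rule: the bits of the Unicode number are poured, right to left, into the free positions of a 1-, 2-, or 3-byte template. A minimal sketch of that rule, where utf8_encode is a hypothetical function written for illustration. It covers only the three lengths shown on the slides and deliberately omits 4-byte sequences and surrogate checks:

```python
def utf8_encode(code_point: int) -> bytes:
    """Encode a code point below U+10000 by filling the UTF-8 bit templates."""
    if code_point < 0x80:                      # up to 7 bits:  0xxxxxxx
        return bytes([code_point])
    if code_point < 0x800:                     # up to 11 bits: 110xxxxx 10xxxxxx
        return bytes([
            0xC0 | (code_point >> 6),          # top 5 bits after the 110 prefix
            0x80 | (code_point & 0x3F),        # low 6 bits after the 10 prefix
        ])
    if code_point < 0x10000:                   # up to 16 bits: 1110xxxx 10xxxxxx 10xxxxxx
        return bytes([
            0xE0 | (code_point >> 12),         # top 4 bits after the 1110 prefix
            0x80 | ((code_point >> 6) & 0x3F), # middle 6 bits
            0x80 | (code_point & 0x3F),        # low 6 bits
        ])
    raise ValueError("code points above U+FFFF are outside this sketch")

for ch in "aж龍":
    manual = utf8_encode(ord(ch))
    print(f"U+{ord(ch):04X} -> {manual.hex(' ')}  matches built-in: {manual == ch.encode('utf-8')}")
```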


Page 27: CodeCamp Iasi 10 march 2012 - UTF-8

Variable length encoding

a   61         0110 0001
ж   d0 b6      1101 0000 1011 0110
龍  e9 be 8d   1110 1001 1011 1110 1000 1101


Page 28: CodeCamp Iasi 10 march 2012 - UTF-8

1) Storage space

2) “0” bytes

3) Synchronisation

4) Endianness
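
The list is repeated here, presumably to check the four problems against UTF-8. A sketch of how the byte patterns help with the last three: no zero bytes appear in ordinary text, a reader can find the next character boundary by skipping bytes of the form 10xxxxxx, and the encoding is a plain byte sequence, so there is no byte order to choose. The resync helper is hypothetical:

```python
def resync(data: bytes, start: int) -> int:
    """Return the index of the first byte at or after start that begins a character."""
    i = start
    while i < len(data) and (data[i] & 0xC0) == 0x80:   # 10xxxxxx marks a continuation byte
        i += 1
    return i

data = "aж龍a".encode("utf-8")       # 61 | d0 b6 | e9 be 8d | 61
print(data.hex(" "))

boundary = resync(data, 4)           # index 4 is the middle byte of the 3-byte character
tail = data[boundary:].decode("utf-8")
print(boundary, ascii(tail))

print(0 in data)                     # no zero bytes in this text
```

Storage is the one item that depends on the text: the sample above takes 1, 2, and 3 bytes per character respectively.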


Page 29: CodeCamp Iasi 10 march 2012 - UTF-8

Thank you!

And please don’t forget the evaluation form :-)


