2 posts tagged 'UTF-8'

  1. 2008/03/25 weechat - Portable and multi-interface IRC client
  2. 2008/03/06 The mb_detect_encoding function

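An IRC client I learned about through 장범수 of the perlmania study group.
It runs in console mode and supports UTF-8 by default.
For Korean, setting the charset to cp949 makes the text display without problems (I think I set xchat up the same way).
It is also available in the Gentoo Portage tree.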


# emerge -S weechat
Searching...
[ Results for search key : weechat ]
[ Applications found : 1 ]

*  net-irc/weechat
     Latest version available: 0.2.6
     Latest version installed: 0.2.6
     Size of files: 1,080 kB
     Homepage:      http://weechat.flashtux.org/
     Description:   Portable and multi-interface IRC client.
     License:       GPL-3


# emerge -pv weechat

These are the packages that would be merged, in order:

Calculating dependencies... done!
[ebuild  N    ] app-crypt/opencdk-0.6.6  USE="-doc" 471 kB
[ebuild  N    ] dev-libs/libtasn1-1.3  USE="-doc" 1,492 kB
[ebuild  N    ] net-libs/gnutls-2.2.2  USE="nls zlib -bindist -doc -guile -lzo" 4,809 kB
[ebuild  N    ] net-irc/weechat-0.2.6  USE="perl python ssl -debug -lua -ruby -spell" 1,081 kB

Total: 4 packages (4 new), Size of downloads: 7,851 kB

# emerge weechat
# weechat-curses irc://your_nick@server_name[:port]

] /charset cp949    # set the charset (default: UTF-8)
] /join linux       # join the linux channel
] /quit             # quit
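
Why the charset setting matters can be seen at the byte level. The following is a minimal PHP sketch, assuming the mbstring extension is loaded; the sample word 한글 and the variable names are illustrative only and say nothing about WeeChat's internals. EUC-KR is used as the encoding name because CP949 is a superset of it and the two agree on these syllables.

```php
<?php
// Assumption: mbstring is available. The sample text is "한글", written as
// hex so the result does not depend on how this file itself is saved.
$utf8 = hex2bin("ed959ceab880");                  // "한글" in UTF-8

// The same text as a CP949/EUC-KR peer would put it on the wire.
$wire = mb_convert_encoding($utf8, "EUC-KR", "UTF-8");
echo bin2hex($wire), "\n";                        // c7d1b1db

// Those bytes are not valid UTF-8, so a client left at the UTF-8 default
// cannot render them correctly.
var_dump(mb_check_encoding($wire, "UTF-8"));      // bool(false)

// Decoding with the matching charset recovers the original text.
$decoded = mb_convert_encoding($wire, "UTF-8", "EUC-KR");
var_dump($decoded === $utf8);                     // bool(true)
```

This is the mismatch that a charset setting such as cp949 avoids when the other side sends CP949 bytes.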


For details, see the URL below.
http://weechat.flashtux.org/doc/en/weechat.en.html


http://www.php.net/manual/en/function.mb-detect-encoding.php


A very useful function for determining the character set of a piece of data.

string mb_detect_encoding ( string $str [, mixed $encoding_list [, bool $strict ]] )

mb_detect_encoding() detects the character encoding of the string str. It returns the detected character encoding.

encoding_list is a list of character encodings. The encoding order may be specified as an array or as a comma-separated list string.

If encoding_list is omitted, detect_order is used.

strict specifies whether to use the strict encoding detection or not. Default is FALSE.



Example#1 mb_detect_encoding() example

<?php
/* Detect character encoding with current detect_order */
echo mb_detect_encoding($str);

/* "auto" is expanded according to mbstring.language; with Japanese it becomes "ASCII,JIS,UTF-8,EUC-JP,SJIS" */
echo mb_detect_encoding($str, "auto");

/* Specify encoding_list character encoding by comma separated list */
echo mb_detect_encoding($str, "JIS, eucjp-win, sjis-win");

/* Use array to specify encoding_list  */
$ary[] = "ASCII";
$ary[] = "JIS";
$ary[] = "EUC-JP";
echo mb_detect_encoding($str, $ary);
?>
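
The manual example only uses Japanese encodings. For Korean data, a minimal sketch along the same lines might look like the following. It assumes mbstring is loaded; `to_utf8` and its default candidate list are hypothetical names made up for this illustration and are not part of mbstring. Strict mode is turned on so that bytes that are invalid in a candidate encoding rule that candidate out.

```php
<?php
// Hypothetical helper (not part of mbstring): normalize input to UTF-8.
// Assumption: the input is either UTF-8 or EUC-KR; adjust $candidates as needed.
function to_utf8(string $bytes, array $candidates = ["UTF-8", "EUC-KR"]): string
{
    // Third argument true = strict detection.
    $detected = mb_detect_encoding($bytes, $candidates, true);
    if ($detected === false) {
        throw new InvalidArgumentException("none of the candidate encodings matched");
    }
    // Already UTF-8: return as is. Otherwise convert from the detected encoding.
    return $detected === "UTF-8"
        ? $bytes
        : mb_convert_encoding($bytes, "UTF-8", $detected);
}

// "한글" as EUC-KR bytes and as UTF-8 bytes, written as hex.
echo bin2hex(to_utf8(hex2bin("c7d1b1db"))), "\n";      // ed959ceab880
echo bin2hex(to_utf8(hex2bin("ed959ceab880"))), "\n";  // ed959ceab880
```

Detection is a guess based on which byte sequences are valid, so short or ASCII-only input can match several candidates; keeping the candidate list small and ordered by likelihood is a deliberate choice here.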


Links to good documents about UTF-8

http://wyb330.egloos.com/3535249

http://sizuha.egloos.com/2563282

http://kldp.org/node/91358

http://blog.wystan.net/2007/08/18/bom-byte-order-mark-problem

http://b.mytears.org/2005/01/101

http://b.mytears.org/2005/03/136