Data Compression Pointers
This page is partly written in Japanese.
It is a mixture of Japanese and English.
News
[2005-03] Some Recent News:
[2003-07]
LZW Patent and Software Information:
The U.S. LZW patent expires June 20, 2003, the counterpart Canadian patent expires July 7, 2004, the counterpart patents in the United Kingdom, France, Germany and Italy expire June 18, 2004, and the Japanese counterpart patents expire June 20, 2004.
[2003-02-2X]
buffer overrun in zlib 1.1.4
Has the U.S. LZW patent already expired?
(Re: LZW Patent Expiry)
[It appears to be correct that it remains valid until June 2003.]
Japanese patents 2123602 and 2610084 were filed on 1984-06-20, so are they valid until 2004-06-19?
[2003-01-30] bwtzip:
A Linear-Time Portable Research-Grade Universal Data Compressor (by Stephan T. Lavavej)
[2002-09-30] UnZip versions < 5.50
security vulnerability / 5.50 DOS version textmode corruption bug
[2002-07-20] special page on JPEG Patent
[2002-04-18] JPEG-LS dll and plug-in available at HP Labs.
[2002-04-15] Mark Nelson's Data Compression Library renamed:
DataCompression.info
[2002-04-07] ZeoSync Patent: Relational Differentiation Encoding (local mirror: zeosync.pdf) (See also my ZeoSync page in Japanese)
[2002-03-29] Open-source ARJ started
(Project Info)
[2002-03-13] zlib Compression Library Corrupts malloc Data Structures via Double Free
Some of my writings
My miscellaneous writings on compression are here.
Among my writings on data compression,
the only one written in English is
History of Data Compression in Japan,
which is rather outdated.
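Books in Japanese
- 植松友彦, 『文書データ圧縮アルゴリズム入門』 (Introduction to Document Data Compression Algorithms)
CQ出版社, 1994, 2718 yen, ISBN 4-7898-3672-X
- M. Nelson and J.-L. Gailly (translated by 荻原剛志 and 山口英), 『データ圧縮ハンドブック』 (Data Compression Handbook)
トッパン, 1996, with floppy disk, 5728 yen, ISBN 4-8101-8605-9
- 情報理論とその応用学会 (ed.), 『情報源符号化――無歪みデータ圧縮』 (Source Coding: Distortionless Data Compression)
培風館, 1998, 3200 yen, ISBN 4-563-01450-8
- 韓 太舜 and 小林欣吾, 『情報と符号化の数理』 (Mathematics of Information and Coding)
培風館, 1999, 4500 yen, ISBN 4-563-00599-1
- 橋本 猛, 『情報理論』 (Information Theory)
培風館, 1997, 2800 yen, ISBN 4-563-01398-6
- 平澤茂一, 『情報理論』 (Information Theory)
培風館, 1996, 2900 yen, ISBN 4-563-01491-5
- 藤原 洋 (supervising ed.), 『画像&音声圧縮技術のすべて』 (All About Image and Audio Compression Technology)
special issue of インターフェース, CQ出版社, April 2000, 2200 yen
- 貴家仁志 and 村松正吾, 『マルチメディア技術の基礎 DCT入門』 (Fundamentals of Multimedia Technology: Introduction to the DCT)
CQ出版社, 1997, 2500 yen, ISBN 4-7898-3679-7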
Introductory books in English
- Ian H. Witten,
Alistair Moffat,
and Timothy C. Bell,
Managing Gigabytes: Compressing and Indexing Documents and Images, 2nd ed.
(Morgan Kaufmann, 1999, ISBN 1-55860-570-3)
- Khalid Sayood,
Introduction to Data Compression, 2nd ed.
(Morgan Kaufmann Publishers, 2000)
- David Salomon (author of
The Advanced TeXbook),
Data Compression: The Complete Reference, 2nd edition
(Springer, 2000, ISBN 0-387-95045-1)
- Mark Nelson and Jean-loup Gailly,
The Data Compression Book, 2nd ed.
(M&T Books, 1996, ISBN 1-55851-434-1)
- Darrel Hankerson, Greg A. Harris, and Peter D. Johnson Jr.,
Introduction to Information Theory and Data Compression
(CRC Press, 1997, ISBN 0-8493-3985-5)
- Jerry Gibson, Toby Berger,
Tom Lookabaugh, Rich Baker and David Lindbergh,
Digital Compression for Multimedia: Principles & Standards
(Morgan Kaufmann, 1998, ISBN 1-55860-369-7)
Newsgroups
comp.compression FAQ is archived at:
Links to Links
Mark Nelson's Data Compression Library
(old / new)
also has a lot of links.
Test Data Sets
The Calgary Corpus used to be the usual choice,
but the newer Canterbury Corpus is recommended.
Info-ZIP, gzip, zlib, ...
LHA
Haruyasu Yoshizaki (Yoshi)'s main email address is/was
SDI00506@nifty.ne.jp,
but I've been unable to get in touch with him lately.
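Compression Using the Burrows-Wheeler Transform
The Burrows-Wheeler transform (block sorting) is a method unlike any before it.
When I asked Mr. Burrows,
he said he has no intention of patenting it, so everyone is free to use it.
Several experimental implementations were written in Japan as well,
but the best-known implementation at present is bzip2.
bzip2 is not as fast as gzip but is reasonably fast,
and on files above a certain size it compresses better than gzip.
The standard extension for compressed files produced by bzip2 is .bz2.
Some Linux distributions have also begun to use bzip2 alongside gzip.
Note that the older bzip used arithmetic coding and therefore has patent problems.
szip is similar software.
Schindler formerly worked at CERN,
where he handled tasks such as compressing high-energy physics experiment data,
and I believe I once received an email from him.
He now appears to work in the software business.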
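As a rough illustration of the block-sorting idea, here is a minimal Python sketch of the Burrows-Wheeler transform and its inverse. It is a naive version that sorts all rotations, not how bzip2 or szip do it; the function names and the sentinel convention are assumptions made for this example.

```python
# Naive Burrows-Wheeler transform: a teaching sketch, not bzip2's algorithm.
# Assumption: a sentinel character that does not occur in the input and
# sorts before every other character marks the end of the text.
SENTINEL = "\x00"


def bwt(text, sentinel=SENTINEL):
    """Return the last column of the sorted rotations of text + sentinel."""
    if sentinel in text:
        raise ValueError("input must not contain the sentinel character")
    s = text + sentinel
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    return "".join(rotation[-1] for rotation in rotations)


def inverse_bwt(last_column, sentinel=SENTINEL):
    """Rebuild the original text by repeatedly prepending and sorting."""
    n = len(last_column)
    table = [""] * n
    for _ in range(n):
        # Prepend the last column to every row, then sort the rows again.
        table = sorted(last_column[i] + table[i] for i in range(n))
    # The row that ends with the sentinel is the original text + sentinel.
    original = next(row for row in table if row.endswith(sentinel))
    return original[:-1]


if __name__ == "__main__":
    transformed = bwt("banana", "$")
    print(transformed)
    print(inverse_bwt(transformed, "$"))
```

The transform by itself does not compress anything. It tends to group identical characters together, which makes the output easier for a later coding stage to compress.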
Range Coder
Almost as good as arithmetic coder, and patent free!
JPEG
JPEG-LS
JPEG-LS, a proposed JPEG lossless/near-lossless image compression mode,
is faster and in many cases compresses tighter than zlib-based PNG.
Its core algorithm is LOCO-I.
``As part of the JPEG-LS work, HP and Mitsubishi have graciously
agreed that the patents needed for implementation of the standard may
be used without payment of license or royalty fees, and IBM have
offered their QM coder patents on a similar basis for JPEG and JBIG
work'' -- JPEG.ORG,
but you must fill out this form.
(US Patent No. 5,680,129, "System and Method for Lossless Image
Compression"; US Patent No. 5,764,374, "System and Method for Lossless
Image Compression Having Improved Sequential Determination of Golomb
Parameter")
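To give a feel for what LOCO-I does, the following Python sketch shows two of its ingredients: the median edge detector predictor and a plain Golomb-Rice code with a power-of-two parameter. It is a simplified illustration only. Context modeling, bias correction, run mode, adaptive choice of the parameter k, and the exact bit conventions of the standard are all left out, and the function names are made up for this example.

```python
def med_predict(a, b, c):
    """Median edge detector: a = left, b = above, c = above-left pixel."""
    if c >= max(a, b):
        return min(a, b)
    if c <= min(a, b):
        return max(a, b)
    return a + b - c


def map_residual(e):
    """Fold a signed prediction error into a non-negative integer
    (0, -1, 1, -2, 2, ... -> 0, 1, 2, 3, 4, ...). Simplified mapping."""
    return 2 * e if e >= 0 else -2 * e - 1


def rice_encode(n, k):
    """Golomb-Rice code of non-negative n with parameter 2**k, as a bit string.
    Assumed convention: quotient in unary as ones ended by a zero,
    followed by the k low-order bits of n."""
    if n < 0 or k < 0:
        raise ValueError("n and k must be non-negative")
    quotient = n >> k
    bits = "1" * quotient + "0"
    if k > 0:
        bits += format(n & ((1 << k) - 1), "0{}b".format(k))
    return bits


if __name__ == "__main__":
    left, above, above_left, actual = 10, 20, 15, 17
    predicted = med_predict(left, above, above_left)
    residual = actual - predicted
    print(predicted, residual, rice_encode(map_residual(residual), 1))
```

Small residuals get short codes, so the better the predictor, the shorter the output.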
- HP Labs LOCO-I/JPEG-LS Home Page
- Marcelo J. Weinberger, Gadiel Seroussi, and Guillermo Sapiro.
LOCO-I: A Low Complexity, Context-Based,
Lossless Image Compression Algorithm.
Proceedings of the Data Compression Conference (DCC '96),
March 1996, Snowbird, Utah.
(IEEE Computer Society Press, 1996, ISBN 0-8186-7358-3)
pp. 140--149. [preprint available from the above site]
- Gadiel Seroussi and Marcelo J. Weinberger.
On Adaptive Strategies for an Extended Family
of Golomb-type Codes.
Proceedings of the Data Compression Conference (DCC '97),
March 1997, Snowbird, Utah.
(IEEE Computer Society Press, 1997, ISBN 0-8186-7761-9)
pp. 131--140.
- JPEG-LS Software (broken)
- SPMG
(Signal Processing & Multimedia Group,
University of British Columbia)
[Software -- JPEG-LS]
- Memory Efficient Scalable Line-based Image Coding
(HPL-1999-1)
- From LOCO-I to the JPEG-LS Standard
(HPL-1999-3)
JPEG2000
MPEG
Lossless Video
JBIG
GIF, LZW, Unisys
Wavelet
See also Astronomy/FITS below.
Astronomy and FITS
FITS = Flexible Image Transport System
Seismic Data
Other Formats
Patents
FLDC (Fujitsu Lossless Data Compression)
3D Compression
Other (Japan)
Other (Overseas)
- Charles Bloom's Page
- FLId Lib
- Mark Nelson:
DataCompression.info
- Archive Comparison Test (ACT) by Jeff Gilchrist
- UltraCompressor
- University of Washington Data Compression Laboratory
- DCP Research
- Telenor's H.263 Software,
ftp://bonde.nta.no/pub/tmn/software
- H.264 draft,
Tutorial
- X1 archiver (ftp)
- What is Binhex and Where Can I Get It?
- The Digital Video Broadcasting Project
- Hamarsoft HAP Archiver
- TinyHAP (ftp)
- The Vision Group at NASA Ames Research Center (look under 'Publications')
- IEEE Communications Society (text version)
- PKWARE
- Data compression for KLOE
- Provine's Data Compression Page
- Mass Spectrometry Data Compression Standard
- Data compression on zero suppressed High Energy Physics Data
- Data Compression for Real-Time Control
- Huffman Coding of ACIS Pixel Data
(ACIS = AXAF CCD Imaging Spectrometer,
AXAF = Advanced X-ray Astrophysics Facility)
- Data Compression
(Lelewer and Hirschberg; recommended reading for beginners)
- LZO
(a very fast routine written by a person with the long name
Markus Franz Xaver Johannes Oberhumer)
- Introduction to Data Compression
- Data Compression Techniques
- Compaction Technologies
(compression software written in Java)
- Data compression for the scientific satellite Yohkoh
(Yohkoh Data Archive Center,
Yohkoh Analysis Guide)
- Rockwall Voice ADPCM Specification and Source Code
- Pbm2cps
- Software Area
by Dennis Lee (WA: Waveform Archiver)
- Tim Bell,
Alistair Moffat,
Ian Witten
- Dr. A.J. Robinson (Tony Robinson)
(see Speech and Audio Coding -- Shorten)
(He moved to SoftSound)
- Jason Jordan's shorten 3.x for Unix
- The Lossless Compression (Squeeze) Page
- BMZ
(looks like simply a gzip'ed BMP format)
- Data Compression Conference (I hear you can ski there and it is fun. I would like to go one of these days.)
- Daniel J. Bernstein:
yabba
- Real-Time Lossless Compression Systems
(hardware compression, X-Match, main-memory compression)
- Signal Processing & Coding Laboratory (SPACL)
- Universal Source Encoding for Science Data (USES)
`szip' (Lossless signal compression, Rice code?
Not to be confused with Michael Schindler's)
Binaries for Win32, Solaris, etc. available. No source code.
- Compressed VRML: Specification
- Alba's Publications
(3-D data compression applied to N.M.R. tomography medical images
and multispectral LandSat, AVIRIS & SAR satellite images through
pyramidal techniques)
- The Lenna Story
- Multimedia Systems Design
(Online Magazine)
- Real Time Speech Compression Software (Linux Phone Project)
- A sample chapter from: The Scientist and Engineer's Guide to Digital Signal Processing by Steven W. Smith
- An Optimizing Hybrid LZ77 RLE Data Compression Program, aka Improving Compression Ratio for Low-Resource Decompression
- Standard ECMA-222: Adaptive Lossless Data Compression Algorithm
- Bernie's TMW0.51 Page
- Dr Ross's Compression Crypt
- Quantization and Data Compression (Stanford EE372)
- GFF Home Page
(Encyclopedia of Graphics File Formats)
- Remote Communications Inc.
(HyperSpace: Compressed HTML)
- Morgan Multimedia
(M-JPEG software codec)
- John McGowan's AVI Overview
- Digital Video (Colin Manning)
- Brodnik and
Carlsson:
Sub-linear Decoding of Huffman Codes Almost In-Place
- Data Compression Reference Center (not much info yet)
- DAKS, LLC Home Page
(lossless compression of integer data; US patent 5,825,830)
- Pegasus imaging:
Lossless Image Compression (ELS Arithmetic Coder) /
Sound Compression from an Imaging Company
- SPIHT Image Compression
(Image Compression with Set Partitioning in Hierarchical Trees)
...seems to have crashed. See also
Center for Image Processing Research
- Ra2Wav decoder
- Split2000
(lossless audio compressor)
- TIFF: ftp://ftp.sgi.com/graphics/tiff/ (libtiff: tiff-v3.4beta037.tar.gz)
- DjVu
(some of the source code is now
available)
- LADsoft Real Audio Page
- Microsoft's CAB page (LZX compression)
- Yakov Nekrich's Lossless Image Compression Links
- Stuart Inglis
(PhD thesis and source code)
- SCOTT's "one to one" compression discussion
- USING AN OBJECT DATABASE AND MASS STORAGE SYSTEM FOR PHYSICS ANALYSIS (CERN, 1997)
(briefly describes NTFS compression efficiency for physics data)
- Cora
(see the Compression section)
- What you really need to know about Desktop Video Conferencing Systems
- Death of David Huffman, 10-11-99
- PKZip Creator Dies
(ABCNEWS.com)
- Lossless Compression of Digital Audio
(HPL-1999-144)
- Shannon:
A Mathematical Theory of Communication
- Lossless Transform Audio Compression (old) /
LPAC - Lossless Predictive Audio Compression
- A Nearest-Neighbor Based Predictive Lossless Image Coder
- Lossless audio compression
- Binary Tree Predictive Coding
- Mat Hans:
Optimization of Digital Audio for Internet Transmission
- VoiceAge (ACELP)
- Lossless Compression of High-volume Numerical Data from Simulations (Vadim Engelson, Dag Fritzson, Peter Fritzson)
- The Computer Journal - Volume 40 Issue 2/3 (Lossless Compression)
- Handout (Glen Langdon)
- shcodec (static Huffman codec)
- MUSICompress (US Patent #5,839,100)
- UPX: the Ultimate Packer for eXecutables
- Arturo Campos home page (lots of compression stuff)
- FLAC (Free Lossless Audio Codec)
- lzip (released April 1, 2000. Download it and
open lzip in an image viewer :)
- Monkey's Audio - a fast and powerful lossless audio compressor
- MPEG-4 Audio Lossless Coding (ALS)
- Meridian -- Library Papers --
J. R. Stewart et al.,
MLP Lossless Compression.
Compression/Decompression Tools on Windows
- Susieの部屋 (Susie's Room)
(by 竹村 嘉人)
-- Lhasa (a simple tool that only extracts lzh and zip)
- +Lhaca (by 村山 富男)
lzh and zip compression and extraction
- とりあえずホームページ
(by 白川 泰洋)
解凍レンジ (extraction): lzh, zip, cab, arj, (tar), gz, Z, bz2
Easy圧縮 (compression): lzh, zip, cab
- Aladdin Systems --
Aladdin Expander for Windows is a free expander that supports StuffIt
- やまざき@BinaryTechnologyのページ (Yamazaki@BinaryTechnology's page) -- a compression tool for its own YZ format (.yz1)
- とってもごはん (by 鶴田真一) --
GCA (G Compression Archiver = Block Sorting + Range Coder)
奥村晴彦 (Haruhiko Okumura)
Last modified: 2005-03-21 11:52:32