From: Ben Gras
Date: Tue, 13 Jul 2010 19:17:02 +0000 (+0000)
Subject: libarchive port by Gautam Tirumala.
X-Git-Tag: v3.1.8~245
X-Git-Url: http://zhaoyanbai.com/repos/%22http:/www.isc.org/icons/man.dnssec-coverage.html?a=commitdiff_plain;h=470ab03b866c30eb150b8020c5cf8517ee5b8c5d;p=minix.git

libarchive port by Gautam Tirumala.
---

diff --git a/lib/Makefile b/lib/Makefile
index 58bcb2de8..640a0b934 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -1,7 +1,7 @@
 .include
 SUBDIR= csu libc libcurses libdriver libnetdriver libend libedit libm libsys \
-    libtimers libutil libbz2 libl libhgfs libz libfetch
+    libtimers libutil libbz2 libl libhgfs libz libfetch libarchive
 .if ${COMPILER_TYPE} == "ack"
 SUBDIR+= ack/libd ack/libe ack/libfp ack/liby
diff --git a/lib/libarchive/Makefile b/lib/libarchive/Makefile
new file mode 100644
index 000000000..19ac20af6
--- /dev/null
+++ b/lib/libarchive/Makefile
@@ -0,0 +1,69 @@
+LIB= archive
+SRCS= archive_check_magic.c \
+    archive_entry.c \
+    archive_entry_copy_bhfi.c \
+    archive_entry_copy_stat.c \
+    archive_entry_link_resolver.c \
+    archive_entry_stat.c \
+    archive_entry_strmode.c \
+    archive_entry_xattr.c \
+    archive_read.c \
+    archive_read_data_into_fd.c \
+    archive_read_disk.c \
+    archive_read_disk_entry_from_file.c \
+    archive_read_disk_set_standard_lookup.c \
+    archive_read_extract.c \
+    archive_read_open_fd.c \
+    archive_read_open_file.c \
+    archive_read_open_filename.c \
+    archive_read_open_memory.c \
+    archive_read_support_compression_all.c \
+    archive_read_support_compression_bzip2.c \
+    archive_read_support_compression_compress.c \
+    archive_read_support_compression_gzip.c \
+    archive_read_support_compression_none.c \
+    archive_read_support_compression_program.c \
+    archive_read_support_compression_uu.c \
+    archive_read_support_compression_xz.c \
+    archive_read_support_format_all.c \
+    archive_read_support_format_ar.c \
+    archive_read_support_format_empty.c \
+    archive_read_support_format_mtree.c \
+
    archive_read_support_format_raw.c \
+    archive_read_support_format_tar.c \
+    archive_read_support_format_xar.c \
+    archive_read_support_format_zip.c \
+    archive_string.c \
+    archive_string_sprintf.c \
+    archive_util.c \
+    archive_virtual.c \
+    archive_write.c \
+    archive_write_disk.c \
+    archive_write_disk_set_standard_lookup.c \
+    archive_write_open_fd.c \
+    archive_write_open_file.c \
+    archive_write_open_filename.c \
+    archive_write_open_memory.c \
+    archive_write_set_compression_bzip2.c \
+    archive_write_set_compression_compress.c \
+    archive_write_set_compression_gzip.c \
+    archive_write_set_compression_none.c \
+    archive_write_set_compression_program.c \
+    archive_write_set_compression_xz.c \
+    archive_write_set_format.c \
+    archive_write_set_format_ar.c \
+    archive_write_set_format_by_name.c \
+    archive_write_set_format_mtree.c \
+    archive_write_set_format_pax.c \
+    archive_write_set_format_shar.c \
+    archive_write_set_format_ustar.c \
+    archive_write_set_format_zip.c \
+    filter_fork.c \
+    minix_utils.c
+
+CPPFLAGS+= -DHAVE_CONFIG_H
+INCSDIR= /usr/include
+INCS= archive.h \
+    archive_entry.h
+
+.include
diff --git a/lib/libarchive/README b/lib/libarchive/README
new file mode 100644
index 000000000..cdc8e7346
--- /dev/null
+++ b/lib/libarchive/README
@@ -0,0 +1,24 @@
+What's supported, what's not
+----------------------------
+This port corresponds to libarchive-2.8.3. The following formats supported
+by libarchive are NOT supported in the port:
+1) iso9660
+2) various variants of cpio
+
+In addition, though xz and lzma are included, due to the lack of
+liblzma and xz utilities on Minix they are not of much use. Of the remaining
+formats I know that tar and its variants (tar.gz, tar.bz2 etc) work.
+
+Notes on the port
+-----------------
+The cause of all changes is the fact that ACK does not have 64-bit types.
+Most of the changes are 'downsizing' of types from 64 bits to 32 bits.
+Also, a signed type is used for measuring sizes, so nothing > 2GB will work.
+
+Most of the changes are repetitive and can be classified into two types:
+
+1) Changing sizes/offsets/timestamps from 64-bit types to size_t, ssize_t, off_t,
+   time_t
+2) Changing functions that either decode or encode sizes/offsets/timestamps
+   from or to archives to use 32-bit types.
+
diff --git a/lib/libarchive/archive.h b/lib/libarchive/archive.h
new file mode 100644
index 000000000..9e585bab8
--- /dev/null
+++ b/lib/libarchive/archive.h
@@ -0,0 +1,766 @@
+/*-
+ * Copyright (c) 2003-2007 Tim Kientzle
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ * 1. Redistributions of source code must retain the above copyright
+ *    notice, this list of conditions and the following disclaimer.
+ * 2. Redistributions in binary form must reproduce the above copyright
+ *    notice, this list of conditions and the following disclaimer in the
+ *    documentation and/or other materials provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE AUTHOR(S) ``AS IS'' AND ANY EXPRESS OR
+ * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
+ * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
+ * IN NO EVENT SHALL THE AUTHOR(S) BE LIABLE FOR ANY DIRECT, INDIRECT,
+ * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
+ * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF
+ * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ *
+ * $FreeBSD: src/lib/libarchive/archive.h.in,v 1.50 2008/05/26 17:00:22 kientzle Exp $
+ */
+
+#ifndef ARCHIVE_H_INCLUDED
+#define ARCHIVE_H_INCLUDED
+
+/*
+ * Note: archive.h is for use outside of libarchive; the configuration
+ * headers (config.h, archive_platform.h, etc.) are purely internal.
+ * Do NOT use HAVE_XXX configuration macros to control the behavior of
+ * this header! If you must conditionalize, use predefined compiler and/or
+ * platform macros.
+ */
+#if defined(__BORLANDC__) && __BORLANDC__ >= 0x560
+# define __LA_STDINT_H
+#elif !defined(__WATCOMC__) && !defined(_MSC_VER) && !defined(__INTERIX) && !defined(__BORLANDC__)
+# define __LA_STDINT_H
+#endif
+
+#include
+#include /* Linux requires this for off_t */
+#ifdef __LA_STDINT_H
+# include __LA_STDINT_H /* int64_t, etc. */
+#endif
+#include /* For FILE * */
+
+/* Get appropriate definitions of standard POSIX-style types. */
+/* These should match the types used in 'struct stat' */
+#if defined(_WIN32) && !defined(__CYGWIN__)
+#define __LA_INT64_T __int64
+# if defined(_SSIZE_T_DEFINED)
+# define __LA_SSIZE_T ssize_t
+# elif defined(_WIN64)
+# define __LA_SSIZE_T __int64
+# else
+# define __LA_SSIZE_T long
+# endif
+# if defined(__BORLANDC__)
+# define __LA_UID_T uid_t
+# define __LA_GID_T gid_t
+# else
+# define __LA_UID_T short
+# define __LA_GID_T short
+# endif
+#elif defined(__minix)
+#include /* ssize_t, uid_t, and gid_t */
+#define __LA_SSIZE_T ssize_t
+#define __LA_UID_T uid_t
+#define __LA_GID_T gid_t
+#else
+#include /* ssize_t, uid_t, and gid_t */
+#define __LA_INT64_T int64_t
+#define __LA_SSIZE_T ssize_t
+#define __LA_UID_T uid_t
+#define __LA_GID_T gid_t
+#endif
+
+/*
+ * On Windows, define LIBARCHIVE_STATIC if you're building or using a
+ * .lib. The default here assumes you're building a DLL. Only
+ * libarchive source should ever define __LIBARCHIVE_BUILD.
+ */
+#if ((defined __WIN32__) || (defined _WIN32) || defined(__CYGWIN__)) && (!defined LIBARCHIVE_STATIC)
+# ifdef __LIBARCHIVE_BUILD
+# ifdef __GNUC__
+# define __LA_DECL __attribute__((dllexport)) extern
+# else
+# define __LA_DECL __declspec(dllexport)
+# endif
+# else
+# ifdef __GNUC__
+# define __LA_DECL __attribute__((dllimport)) extern
+# else
+# define __LA_DECL __declspec(dllimport)
+# endif
+# endif
+#else
+/* Static libraries or non-Windows needs no special declaration. */
+# define __LA_DECL
+#endif
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/*
+ * The version number is provided as both a macro and a function.
+ * The macro identifies the installed header; the function identifies
+ * the library version (which may not be the same if you're using a
+ * dynamically-linked version of the library). Of course, if the
+ * header and library are very different, you should expect some
+ * strangeness. Don't do that.
+ */
+
+/*
+ * The version number is expressed as a single integer that makes it
+ * easy to compare versions at build time: for version a.b.c, the
+ * version number is printf("%d%03d%03d",a,b,c). For example, if you
+ * know your application requires version 2.12.108 or later, you can
+ * assert that ARCHIVE_VERSION_NUMBER >= 2012108.
+ *
+ * This single-number format was introduced with libarchive 1.9.0 in
+ * the libarchive 1.x family and libarchive 2.2.4 in the libarchive
+ * 2.x family. The following may be useful if you really want to do
+ * feature detection for earlier libarchive versions (which defined
+ * ARCHIVE_API_VERSION and ARCHIVE_API_FEATURE instead):
+ *
+ * #ifndef ARCHIVE_VERSION_NUMBER
+ * #define ARCHIVE_VERSION_NUMBER \
+ *     (ARCHIVE_API_VERSION * 1000000 + ARCHIVE_API_FEATURE * 1000)
+ * #endif
+ */
+#define ARCHIVE_VERSION_NUMBER 2008003
+__LA_DECL int archive_version_number(void);
+
+/*
+ * Textual name/version of the library, useful for version displays.
+ */
+#define ARCHIVE_VERSION_STRING "libarchive 2.8.3"
+__LA_DECL const char * archive_version_string(void);
+
+#if ARCHIVE_VERSION_NUMBER < 3000000
+/*
+ * Deprecated; these are older names that will be removed in favor of
+ * the simpler definitions above.
+ */
+#define ARCHIVE_VERSION_STAMP ARCHIVE_VERSION_NUMBER
+__LA_DECL int archive_version_stamp(void);
+#define ARCHIVE_LIBRARY_VERSION ARCHIVE_VERSION_STRING
+__LA_DECL const char * archive_version(void);
+#define ARCHIVE_API_VERSION (ARCHIVE_VERSION_NUMBER / 1000000)
+__LA_DECL int archive_api_version(void);
+#define ARCHIVE_API_FEATURE ((ARCHIVE_VERSION_NUMBER / 1000) % 1000)
+__LA_DECL int archive_api_feature(void);
+#endif
+
+#if ARCHIVE_VERSION_NUMBER < 3000000
+/* This should never have been here in the first place. */
+/* Legacy of old tar assumptions, will be removed in libarchive 3.0. */
+#define ARCHIVE_BYTES_PER_RECORD 512
+#define ARCHIVE_DEFAULT_BYTES_PER_BLOCK 10240
+#endif
+
+/* Declare our basic types. */
+struct archive;
+struct archive_entry;
+
+/*
+ * Error codes: Use archive_errno() and archive_error_string()
+ * to retrieve details. Unless specified otherwise, all functions
+ * that return 'int' use these codes.
+ */
+#define ARCHIVE_EOF 1 /* Found end of archive. */
+#define ARCHIVE_OK 0 /* Operation was successful. */
+#define ARCHIVE_RETRY (-10) /* Retry might succeed. */
+#define ARCHIVE_WARN (-20) /* Partial success. */
+/* For example, if write_header "fails", then you can't push data. */
+#define ARCHIVE_FAILED (-25) /* Current operation cannot complete. */
+/* But if write_header is "fatal," then this archive is dead and useless. */
+#define ARCHIVE_FATAL (-30) /* No more operations are possible. */
+
+/*
+ * As far as possible, archive_errno returns standard platform errno codes.
+ * Of course, the details vary by platform, so the actual definitions
+ * here are stored in "archive_platform.h".
The symbols are listed here
+ * for reference; as a rule, clients should not need to know the exact
+ * platform-dependent error code.
+ */
+/* Unrecognized or invalid file format. */
+/* #define ARCHIVE_ERRNO_FILE_FORMAT */
+/* Illegal usage of the library. */
+/* #define ARCHIVE_ERRNO_PROGRAMMER_ERROR */
+/* Unknown or unclassified error. */
+/* #define ARCHIVE_ERRNO_MISC */
+
+/*
+ * Callbacks are invoked to automatically read/skip/write/open/close the
+ * archive. You can provide your own for complex tasks (like breaking
+ * archives across multiple tapes) or use standard ones built into the
+ * library.
+ */
+
+/* Returns pointer and size of next block of data from archive. */
+typedef __LA_SSIZE_T archive_read_callback(struct archive *,
+    void *_client_data, const void **_buffer);
+
+/* Skips at most request bytes from archive and returns the skipped amount */
+#if ARCHIVE_VERSION_NUMBER < 2000000
+/* Libarchive 1.0 used ssize_t for the return, which is only 32 bits
+ * on most 32-bit platforms; not large enough. */
+typedef __LA_SSIZE_T archive_skip_callback(struct archive *,
+    void *_client_data, size_t request);
+#elif ARCHIVE_VERSION_NUMBER < 3000000
+/* Libarchive 2.0 used off_t here, but that is a bad idea on Linux and a
+ * few other platforms where off_t varies with build settings. */
+typedef off_t archive_skip_callback(struct archive *,
+    void *_client_data, off_t request);
+#else
+/* Libarchive 3.0 uses int64_t here, which is actually guaranteed to be
+ * 64 bits on every platform. */
+typedef __LA_INT64_T archive_skip_callback(struct archive *,
+    void *_client_data, __LA_INT64_T request);
+#endif
+
+/* Returns size actually written, zero on EOF, -1 on error. */
+typedef __LA_SSIZE_T archive_write_callback(struct archive *,
+    void *_client_data,
+    const void *_buffer, size_t _length);
+
+#if ARCHIVE_VERSION_NUMBER < 3000000
+/* Open callback is actually never needed; remove it in libarchive 3.0.
*/
+typedef int archive_open_callback(struct archive *, void *_client_data);
+#endif
+
+typedef int archive_close_callback(struct archive *, void *_client_data);
+
+/*
+ * Codes for archive_compression.
+ */
+#define ARCHIVE_COMPRESSION_NONE 0
+#define ARCHIVE_COMPRESSION_GZIP 1
+#define ARCHIVE_COMPRESSION_BZIP2 2
+#define ARCHIVE_COMPRESSION_COMPRESS 3
+#define ARCHIVE_COMPRESSION_PROGRAM 4
+#define ARCHIVE_COMPRESSION_LZMA 5
+#define ARCHIVE_COMPRESSION_XZ 6
+#define ARCHIVE_COMPRESSION_UU 7
+#define ARCHIVE_COMPRESSION_RPM 8
+
+/*
+ * Codes returned by archive_format.
+ *
+ * Top 16 bits identifies the format family (e.g., "tar"); lower
+ * 16 bits indicate the variant. This is updated by read_next_header.
+ * Note that the lower 16 bits will often vary from entry to entry.
+ * In some cases, this variation occurs as libarchive learns more about
+ * the archive (for example, later entries might utilize extensions that
+ * weren't necessary earlier in the archive; in this case, libarchive
+ * will change the format code to indicate the extended format that
+ * was used). In other cases, it's because different tools have
+ * modified the archive and so different parts of the archive
+ * actually have slightly different formats. (Both tar and cpio store
+ * format codes in each entry, so it is quite possible for each
+ * entry to be in a different format.)
+ */
+#define ARCHIVE_FORMAT_BASE_MASK 0xff0000
+#ifndef __minix
+#define ARCHIVE_FORMAT_CPIO 0x10000
+#define ARCHIVE_FORMAT_CPIO_POSIX (ARCHIVE_FORMAT_CPIO | 1)
+#define ARCHIVE_FORMAT_CPIO_BIN_LE (ARCHIVE_FORMAT_CPIO | 2)
+#define ARCHIVE_FORMAT_CPIO_BIN_BE (ARCHIVE_FORMAT_CPIO | 3)
+#define ARCHIVE_FORMAT_CPIO_SVR4_NOCRC (ARCHIVE_FORMAT_CPIO | 4)
+#define ARCHIVE_FORMAT_CPIO_SVR4_CRC (ARCHIVE_FORMAT_CPIO | 5)
+#endif
+#define ARCHIVE_FORMAT_SHAR 0x20000
+#define ARCHIVE_FORMAT_SHAR_BASE (ARCHIVE_FORMAT_SHAR | 1)
+#define ARCHIVE_FORMAT_SHAR_DUMP (ARCHIVE_FORMAT_SHAR | 2)
+#define ARCHIVE_FORMAT_TAR 0x30000
+#define ARCHIVE_FORMAT_TAR_USTAR (ARCHIVE_FORMAT_TAR | 1)
+#define ARCHIVE_FORMAT_TAR_PAX_INTERCHANGE (ARCHIVE_FORMAT_TAR | 2)
+#define ARCHIVE_FORMAT_TAR_PAX_RESTRICTED (ARCHIVE_FORMAT_TAR | 3)
+#define ARCHIVE_FORMAT_TAR_GNUTAR (ARCHIVE_FORMAT_TAR | 4)
+#define ARCHIVE_FORMAT_ISO9660 0x40000
+#define ARCHIVE_FORMAT_ISO9660_ROCKRIDGE (ARCHIVE_FORMAT_ISO9660 | 1)
+#define ARCHIVE_FORMAT_ZIP 0x50000
+#define ARCHIVE_FORMAT_EMPTY 0x60000
+#define ARCHIVE_FORMAT_AR 0x70000
+#define ARCHIVE_FORMAT_AR_GNU (ARCHIVE_FORMAT_AR | 1)
+#define ARCHIVE_FORMAT_AR_BSD (ARCHIVE_FORMAT_AR | 2)
+#define ARCHIVE_FORMAT_MTREE 0x80000
+#define ARCHIVE_FORMAT_RAW 0x90000
+#define ARCHIVE_FORMAT_XAR 0xA0000
+
+/*-
+ * Basic outline for reading an archive:
+ * 1) Ask archive_read_new for an archive reader object.
+ * 2) Update any global properties as appropriate.
+ *    In particular, you'll certainly want to call appropriate
+ *    archive_read_support_XXX functions.
+ * 3) Call archive_read_open_XXX to open the archive
+ * 4) Repeatedly call archive_read_next_header to get information about
+ *    successive archive entries. Call archive_read_data to extract
+ *    data for entries of interest.
+ * 5) Call archive_read_finish to end processing.
+ */
+__LA_DECL struct archive *archive_read_new(void);
+
+/*
+ * The archive_read_support_XXX calls enable auto-detect for this
+ * archive handle. They also link in the necessary support code.
+ * For example, if you don't want bzlib linked in, don't invoke
+ * support_compression_bzip2(). The "all" functions provide the
+ * obvious shorthand.
+ */
+__LA_DECL int archive_read_support_compression_all(struct archive *);
+__LA_DECL int archive_read_support_compression_bzip2(struct archive *);
+__LA_DECL int archive_read_support_compression_compress(struct archive *);
+__LA_DECL int archive_read_support_compression_gzip(struct archive *);
+__LA_DECL int archive_read_support_compression_lzma(struct archive *);
+__LA_DECL int archive_read_support_compression_none(struct archive *);
+__LA_DECL int archive_read_support_compression_program(struct archive *,
+    const char *command);
+__LA_DECL int archive_read_support_compression_program_signature
+    (struct archive *, const char *,
+    const void * /* match */, size_t);
+
+#ifndef __minix
+__LA_DECL int archive_read_support_compression_rpm(struct archive *);
+#endif
+__LA_DECL int archive_read_support_compression_uu(struct archive *);
+__LA_DECL int archive_read_support_compression_xz(struct archive *);
+
+__LA_DECL int archive_read_support_format_all(struct archive *);
+__LA_DECL int archive_read_support_format_ar(struct archive *);
+#ifndef __minix
+__LA_DECL int archive_read_support_format_cpio(struct archive *);
+#endif
+__LA_DECL int archive_read_support_format_empty(struct archive *);
+__LA_DECL int archive_read_support_format_gnutar(struct archive *);
+__LA_DECL int archive_read_support_format_iso9660(struct archive *);
+__LA_DECL int archive_read_support_format_mtree(struct archive *);
+__LA_DECL int archive_read_support_format_raw(struct archive *);
+__LA_DECL int archive_read_support_format_tar(struct archive *);
+__LA_DECL int archive_read_support_format_xar(struct archive *);
+__LA_DECL int
archive_read_support_format_zip(struct archive *);
+
+
+/* Open the archive using callbacks for archive I/O. */
+__LA_DECL int archive_read_open(struct archive *, void *_client_data,
+    archive_open_callback *, archive_read_callback *,
+    archive_close_callback *);
+__LA_DECL int archive_read_open2(struct archive *, void *_client_data,
+    archive_open_callback *, archive_read_callback *,
+    archive_skip_callback *, archive_close_callback *);
+
+/*
+ * A variety of shortcuts that invoke archive_read_open() with
+ * canned callbacks suitable for common situations. The ones that
+ * accept a block size handle tape blocking correctly.
+ */
+/* Use this if you know the filename. Note: NULL indicates stdin. */
+__LA_DECL int archive_read_open_filename(struct archive *,
+    const char *_filename, size_t _block_size);
+/* archive_read_open_file() is a deprecated synonym for ..._open_filename(). */
+__LA_DECL int archive_read_open_file(struct archive *,
+    const char *_filename, size_t _block_size);
+/* Read an archive that's stored in memory. */
+__LA_DECL int archive_read_open_memory(struct archive *,
+    void * buff, size_t size);
+/* A more involved version that is only used for internal testing. */
+__LA_DECL int archive_read_open_memory2(struct archive *a, void *buff,
+    size_t size, size_t read_size);
+/* Read an archive that's already open, using the file descriptor. */
+__LA_DECL int archive_read_open_fd(struct archive *, int _fd,
+    size_t _block_size);
+/* Read an archive that's already open, using a FILE *. */
+/* Note: DO NOT use this with tape drives. */
+__LA_DECL int archive_read_open_FILE(struct archive *, FILE *_file);
+
+/* Parses and returns next entry header.
*/
+__LA_DECL int archive_read_next_header(struct archive *,
+    struct archive_entry **);
+
+/* Parses and returns next entry header using the archive_entry passed in */
+__LA_DECL int archive_read_next_header2(struct archive *,
+    struct archive_entry *);
+
+/*
+ * Retrieve the byte offset in UNCOMPRESSED data where last-read
+ * header started.
+ */
+#ifndef __minix
+__LA_DECL __LA_INT64_T archive_read_header_position(struct archive *);
+#else
+__LA_DECL off_t archive_read_header_position(struct archive *);
+#endif
+
+
+/* Read data from the body of an entry. Similar to read(2). */
+__LA_DECL __LA_SSIZE_T archive_read_data(struct archive *,
+    void *, size_t);
+
+/*
+ * A zero-copy version of archive_read_data that also exposes the file offset
+ * of each returned block. Note that the client has no way to specify
+ * the desired size of the block. The API does guarantee that offsets will
+ * be strictly increasing and that returned blocks will not overlap.
+ */
+#if ARCHIVE_VERSION_NUMBER < 3000000
+__LA_DECL int archive_read_data_block(struct archive *a,
+    const void **buff, size_t *size, off_t *offset);
+#else
+__LA_DECL int archive_read_data_block(struct archive *a,
+    const void **buff, size_t *size,
+    __LA_INT64_T *offset);
+#endif
+
+/*-
+ * Some convenience functions that are built on archive_read_data:
+ * 'skip': skips entire entry
+ * 'into_buffer': writes data into memory buffer that you provide
+ * 'into_fd': writes data to specified filedes
+ */
+__LA_DECL int archive_read_data_skip(struct archive *);
+__LA_DECL int archive_read_data_into_buffer(struct archive *,
+    void *buffer, __LA_SSIZE_T len);
+__LA_DECL int archive_read_data_into_fd(struct archive *, int fd);
+
+/*
+ * Set read options.
+ */
+/* Apply option string to the format only. */
+__LA_DECL int archive_read_set_format_options(struct archive *_a,
+    const char *s);
+/* Apply option string to the filter only.
*/
+__LA_DECL int archive_read_set_filter_options(struct archive *_a,
+    const char *s);
+/* Apply option string to both the format and the filter. */
+__LA_DECL int archive_read_set_options(struct archive *_a,
+    const char *s);
+
+/*-
+ * Convenience function to recreate the current entry (whose header
+ * has just been read) on disk.
+ *
+ * This does quite a bit more than just copy data to disk. It also:
+ * - Creates intermediate directories as required.
+ * - Manages directory permissions: non-writable directories will
+ *   be initially created with write permission enabled; when the
+ *   archive is closed, dir permissions are edited to the values specified
+ *   in the archive.
+ * - Checks hardlinks: hardlinks will not be extracted unless the
+ *   linked-to file was also extracted within the same session. (TODO)
+ */
+
+/* The "flags" argument selects optional behavior, 'OR' the flags you want. */
+
+/* Default: Do not try to set owner/group. */
+#define ARCHIVE_EXTRACT_OWNER (0x0001)
+/* Default: Do obey umask, do not restore SUID/SGID/SVTX bits. */
+#define ARCHIVE_EXTRACT_PERM (0x0002)
+/* Default: Do not restore mtime/atime. */
+#define ARCHIVE_EXTRACT_TIME (0x0004)
+/* Default: Replace existing files. */
+#define ARCHIVE_EXTRACT_NO_OVERWRITE (0x0008)
+/* Default: Try create first, unlink only if create fails with EEXIST. */
+#define ARCHIVE_EXTRACT_UNLINK (0x0010)
+/* Default: Do not restore ACLs. */
+#define ARCHIVE_EXTRACT_ACL (0x0020)
+/* Default: Do not restore fflags. */
+#define ARCHIVE_EXTRACT_FFLAGS (0x0040)
+/* Default: Do not restore xattrs. */
+#define ARCHIVE_EXTRACT_XATTR (0x0080)
+/* Default: Do not try to guard against extracts redirected by symlinks. */
+/* Note: With ARCHIVE_EXTRACT_UNLINK, will remove any intermediate symlink. */
+#define ARCHIVE_EXTRACT_SECURE_SYMLINKS (0x0100)
+/* Default: Do not reject entries with '..' as path elements. */
+#define ARCHIVE_EXTRACT_SECURE_NODOTDOT (0x0200)
+/* Default: Create parent directories as needed.
*/
+#define ARCHIVE_EXTRACT_NO_AUTODIR (0x0400)
+/* Default: Overwrite files, even if one on disk is newer. */
+#define ARCHIVE_EXTRACT_NO_OVERWRITE_NEWER (0x0800)
+/* Detect blocks of 0 and write holes instead. */
+#define ARCHIVE_EXTRACT_SPARSE (0x1000)
+
+__LA_DECL int archive_read_extract(struct archive *, struct archive_entry *,
+    int flags);
+__LA_DECL int archive_read_extract2(struct archive *, struct archive_entry *,
+    struct archive * /* dest */);
+__LA_DECL void archive_read_extract_set_progress_callback(struct archive *,
+    void (*_progress_func)(void *), void *_user_data);
+
+/* Record the dev/ino of a file that will not be written. This is
+ * generally set to the dev/ino of the archive being read. */
+__LA_DECL void archive_read_extract_set_skip_file(struct archive *,
+    dev_t, ino_t);
+
+/* Close the file and release most resources. */
+__LA_DECL int archive_read_close(struct archive *);
+/* Release all resources and destroy the object. */
+/* Note that archive_read_finish will call archive_read_close for you. */
+#if ARCHIVE_VERSION_NUMBER < 2000000
+/* Erroneously declared to return void in libarchive 1.x */
+__LA_DECL void archive_read_finish(struct archive *);
+#else
+__LA_DECL int archive_read_finish(struct archive *);
+#endif
+
+/*-
+ * To create an archive:
+ * 1) Ask archive_write_new for an archive writer object.
+ * 2) Set any global properties. In particular, you should set
+ *    the compression and format to use.
+ * 3) Call archive_write_open to open the file (most people
+ *    will use archive_write_open_file or archive_write_open_fd,
+ *    which provide convenient canned I/O callbacks for you).
+ * 4) For each entry:
+ *    - construct an appropriate struct archive_entry structure
+ *    - archive_write_header to write the header
+ *    - archive_write_data to write the entry data
+ * 5) archive_write_close to close the output
+ * 6) archive_write_finish to cleanup the writer and release resources
+ */
+__LA_DECL struct archive *archive_write_new(void);
+__LA_DECL int archive_write_set_bytes_per_block(struct archive *,
+    int bytes_per_block);
+__LA_DECL int archive_write_get_bytes_per_block(struct archive *);
+/* XXX This is badly misnamed; suggestions appreciated. XXX */
+__LA_DECL int archive_write_set_bytes_in_last_block(struct archive *,
+    int bytes_in_last_block);
+__LA_DECL int archive_write_get_bytes_in_last_block(struct archive *);
+
+/* The dev/ino of a file that won't be archived. This is used
+ * to avoid recursively adding an archive to itself. */
+__LA_DECL int archive_write_set_skip_file(struct archive *, dev_t, ino_t);
+
+__LA_DECL int archive_write_set_compression_bzip2(struct archive *);
+__LA_DECL int archive_write_set_compression_compress(struct archive *);
+__LA_DECL int archive_write_set_compression_gzip(struct archive *);
+__LA_DECL int archive_write_set_compression_lzma(struct archive *);
+__LA_DECL int archive_write_set_compression_none(struct archive *);
+__LA_DECL int archive_write_set_compression_program(struct archive *,
+    const char *cmd);
+__LA_DECL int archive_write_set_compression_xz(struct archive *);
+/* A convenience function to set the format based on the code or name. */
+__LA_DECL int archive_write_set_format(struct archive *, int format_code);
+__LA_DECL int archive_write_set_format_by_name(struct archive *,
+    const char *name);
+/* To minimize link pollution, use one or more of the following.
*/
+__LA_DECL int archive_write_set_format_ar_bsd(struct archive *);
+__LA_DECL int archive_write_set_format_ar_svr4(struct archive *);
+#ifndef __minix
+__LA_DECL int archive_write_set_format_cpio(struct archive *);
+__LA_DECL int archive_write_set_format_cpio_newc(struct archive *);
+#endif
+__LA_DECL int archive_write_set_format_mtree(struct archive *);
+/* TODO: int archive_write_set_format_old_tar(struct archive *); */
+__LA_DECL int archive_write_set_format_pax(struct archive *);
+__LA_DECL int archive_write_set_format_pax_restricted(struct archive *);
+__LA_DECL int archive_write_set_format_shar(struct archive *);
+__LA_DECL int archive_write_set_format_shar_dump(struct archive *);
+__LA_DECL int archive_write_set_format_ustar(struct archive *);
+__LA_DECL int archive_write_set_format_zip(struct archive *);
+__LA_DECL int archive_write_open(struct archive *, void *,
+    archive_open_callback *, archive_write_callback *,
+    archive_close_callback *);
+__LA_DECL int archive_write_open_fd(struct archive *, int _fd);
+__LA_DECL int archive_write_open_filename(struct archive *, const char *_file);
+/* A deprecated synonym for archive_write_open_filename() */
+__LA_DECL int archive_write_open_file(struct archive *, const char *_file);
+__LA_DECL int archive_write_open_FILE(struct archive *, FILE *);
+/* _buffSize is the size of the buffer, _used refers to a variable that
+ * will be updated after each write into the buffer. */
+__LA_DECL int archive_write_open_memory(struct archive *,
+    void *_buffer, size_t _buffSize, size_t *_used);
+
+/*
+ * Note that the library will truncate writes beyond the size provided
+ * to archive_write_header or pad if the provided data is short.
+ */
+__LA_DECL int archive_write_header(struct archive *,
+    struct archive_entry *);
+#if ARCHIVE_VERSION_NUMBER < 2000000
+/* This was erroneously declared to return "int" in libarchive 1.x.
*/
+__LA_DECL int archive_write_data(struct archive *,
+    const void *, size_t);
+#else
+/* Libarchive 2.0 and later return ssize_t here. */
+__LA_DECL __LA_SSIZE_T archive_write_data(struct archive *,
+    const void *, size_t);
+#endif
+
+#if ARCHIVE_VERSION_NUMBER < 3000000
+/* Libarchive 1.x and 2.x use off_t for the argument, but that's not
+ * stable on Linux. */
+__LA_DECL __LA_SSIZE_T archive_write_data_block(struct archive *,
+    const void *, size_t, off_t);
+#else
+/* Libarchive 3.0 uses explicit int64_t to ensure consistent 64-bit support. */
+__LA_DECL __LA_SSIZE_T archive_write_data_block(struct archive *,
+    const void *, size_t, __LA_INT64_T);
+#endif
+__LA_DECL int archive_write_finish_entry(struct archive *);
+__LA_DECL int archive_write_close(struct archive *);
+#if ARCHIVE_VERSION_NUMBER < 2000000
+/* Return value was incorrect in libarchive 1.x. */
+__LA_DECL void archive_write_finish(struct archive *);
+#else
+/* Libarchive 2.x and later returns an error if this fails. */
+/* It can fail if the archive wasn't already closed, in which case
+ * archive_write_finish() will implicitly call archive_write_close(). */
+__LA_DECL int archive_write_finish(struct archive *);
+#endif
+
+/*
+ * Set write options.
+ */
+/* Apply option string to the format only. */
+__LA_DECL int archive_write_set_format_options(struct archive *_a,
+    const char *s);
+/* Apply option string to the compressor only. */
+__LA_DECL int archive_write_set_compressor_options(struct archive *_a,
+    const char *s);
+/* Apply option string to both the format and the compressor. */
+__LA_DECL int archive_write_set_options(struct archive *_a,
+    const char *s);
+
+
+/*-
+ * ARCHIVE_WRITE_DISK API
+ *
+ * To create objects on disk:
+ * 1) Ask archive_write_disk_new for a new archive_write_disk object.
+ * 2) Set any global properties. In particular, you probably
+ *    want to set the options.
+ * 3) For each entry:
+ *    - construct an appropriate struct archive_entry structure
+ *    - archive_write_header to create the file/dir/etc on disk
+ *    - archive_write_data to write the entry data
+ * 4) archive_write_finish to cleanup the writer and release resources
+ *
+ * In particular, you can use this in conjunction with archive_read()
+ * to pull entries out of an archive and create them on disk.
+ */
+__LA_DECL struct archive *archive_write_disk_new(void);
+/* This file will not be overwritten. */
+__LA_DECL int archive_write_disk_set_skip_file(struct archive *,
+    dev_t, ino_t);
+/* Set flags to control how the next item gets created.
+ * This accepts a bitmask of ARCHIVE_EXTRACT_XXX flags defined above. */
+__LA_DECL int archive_write_disk_set_options(struct archive *,
+    int flags);
+/*
+ * The lookup functions are given uname/uid (or gname/gid) pairs and
+ * return a uid (gid) suitable for this system. These are used for
+ * restoring ownership and for setting ACLs. The default functions
+ * are naive, they just return the uid/gid. These are small, so reasonable
+ * for applications that don't need to preserve ownership; they
+ * are probably also appropriate for applications that are doing
+ * same-system backup and restore.
+ */
+/*
+ * The "standard" lookup functions use common system calls to lookup
+ * the uname/gname, falling back to the uid/gid if the names can't be
+ * found. They cache lookups and are reasonably fast, but can be very
+ * large, so they are not used unless you ask for them. In
+ * particular, these match the specifications of POSIX "pax" and old
+ * POSIX "tar".
+ */
+__LA_DECL int archive_write_disk_set_standard_lookup(struct archive *);
+/*
+ * If neither the default (naive) nor the standard (big) functions suit
+ * your needs, you can write your own and register them. Be sure to
+ * include a cleanup function if you have allocated private data.
+ */ +__LA_DECL int archive_write_disk_set_group_lookup(struct archive *, + void * /* private_data */, + __LA_GID_T (*)(void *, const char *, __LA_GID_T), + void (* /* cleanup */)(void *)); +__LA_DECL int archive_write_disk_set_user_lookup(struct archive *, + void * /* private_data */, + __LA_UID_T (*)(void *, const char *, __LA_UID_T), + void (* /* cleanup */)(void *)); + +/* + * ARCHIVE_READ_DISK API + * + * This is still evolving and somewhat experimental. + */ +__LA_DECL struct archive *archive_read_disk_new(void); +/* The names for symlink modes here correspond to an old BSD + * command-line argument convention: -L, -P, -H */ +/* Follow all symlinks. */ +__LA_DECL int archive_read_disk_set_symlink_logical(struct archive *); +/* Follow no symlinks. */ +__LA_DECL int archive_read_disk_set_symlink_physical(struct archive *); +/* Follow symlink initially, then not. */ +__LA_DECL int archive_read_disk_set_symlink_hybrid(struct archive *); +/* TODO: Handle Linux stat32/stat64 ugliness. */ +__LA_DECL int archive_read_disk_entry_from_file(struct archive *, + struct archive_entry *, int /* fd */, const struct stat *); +/* Look up gname for gid or uname for uid. */ +/* Default implementations are very, very stupid. */ +__LA_DECL const char *archive_read_disk_gname(struct archive *, __LA_GID_T); +__LA_DECL const char *archive_read_disk_uname(struct archive *, __LA_UID_T); +/* "Standard" implementation uses getpwuid_r, getgrgid_r and caches the + * results for performance. */ +__LA_DECL int archive_read_disk_set_standard_lookup(struct archive *); +/* You can install your own lookups if you like. 
*/ +__LA_DECL int archive_read_disk_set_gname_lookup(struct archive *, + void * /* private_data */, + const char *(* /* lookup_fn */)(void *, __LA_GID_T), + void (* /* cleanup_fn */)(void *)); +__LA_DECL int archive_read_disk_set_uname_lookup(struct archive *, + void * /* private_data */, + const char *(* /* lookup_fn */)(void *, __LA_UID_T), + void (* /* cleanup_fn */)(void *)); + +/* + * Accessor functions to read/set various information in + * the struct archive object: + */ +#ifndef __minix +/* Bytes written after compression or read before decompression. */ +__LA_DECL __LA_INT64_T archive_position_compressed(struct archive *); +/* Bytes written to compressor or read from decompressor. */ +__LA_DECL __LA_INT64_T archive_position_uncompressed(struct archive *); +#else +/* Bytes written after compression or read before decompression. */ +__LA_DECL off_t archive_position_compressed(struct archive *); +/* Bytes written to compressor or read from decompressor. */ +__LA_DECL off_t archive_position_uncompressed(struct archive *); +#endif + +__LA_DECL const char *archive_compression_name(struct archive *); +__LA_DECL int archive_compression(struct archive *); +__LA_DECL int archive_errno(struct archive *); +__LA_DECL const char *archive_error_string(struct archive *); +__LA_DECL const char *archive_format_name(struct archive *); +__LA_DECL int archive_format(struct archive *); +__LA_DECL void archive_clear_error(struct archive *); +__LA_DECL void archive_set_error(struct archive *, int _err, + const char *fmt, ...); +__LA_DECL void archive_copy_error(struct archive *dest, + struct archive *src); +__LA_DECL int archive_file_count(struct archive *); + +#ifdef __cplusplus +} +#endif + +/* These are meaningless outside of this header. */ +#undef __LA_DECL +#undef __LA_GID_T +#undef __LA_UID_T + +/* These need to remain defined because they're used in the + * callback type definitions. XXX Fix this. This is ugly. 
XXX */ +/* #undef __LA_INT64_T */ +/* #undef __LA_SSIZE_T */ + +#endif /* !ARCHIVE_H_INCLUDED */ diff --git a/lib/libarchive/archive_check_magic.c b/lib/libarchive/archive_check_magic.c new file mode 100644 index 000000000..e27e5d827 --- /dev/null +++ b/lib/libarchive/archive_check_magic.c @@ -0,0 +1,134 @@ +/*- + * Copyright (c) 2003-2007 Tim Kientzle + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR(S) ``AS IS'' AND ANY EXPRESS OR + * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES + * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. + * IN NO EVENT SHALL THE AUTHOR(S) BE LIABLE FOR ANY DIRECT, INDIRECT, + * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT + * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF + * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 
+ */ + +#include "archive_platform.h" +__FBSDID("$FreeBSD: head/lib/libarchive/archive_check_magic.c 201089 2009-12-28 02:20:23Z kientzle $"); + +#ifdef HAVE_SYS_TYPES_H +#include <sys/types.h> +#endif + +#include <stdio.h> +#ifdef HAVE_STDLIB_H +#include <stdlib.h> +#endif +#ifdef HAVE_STRING_H +#include <string.h> +#endif +#ifdef HAVE_UNISTD_H +#include <unistd.h> +#endif +#if defined(_WIN32) && !defined(__CYGWIN__) +#include <windows.h> +#include <winbase.h> +#endif + +#include "archive_private.h" + +static void +errmsg(const char *m) +{ + size_t s = strlen(m); + ssize_t written; + + while (s > 0) { + written = write(2, m, strlen(m)); + if (written <= 0) + return; + m += written; + s -= written; + } +} + +static void +diediedie(void) +{ +#if defined(_WIN32) && !defined(__CYGWIN__) && defined(_DEBUG) + /* Cause a breakpoint exception */ + DebugBreak(); +#endif + abort(); /* Terminate the program abnormally. */ +} + +static const char * +state_name(unsigned s) +{ + switch (s) { + case ARCHIVE_STATE_NEW: return ("new"); + case ARCHIVE_STATE_HEADER: return ("header"); + case ARCHIVE_STATE_DATA: return ("data"); + case ARCHIVE_STATE_EOF: return ("eof"); + case ARCHIVE_STATE_CLOSED: return ("closed"); + case ARCHIVE_STATE_FATAL: return ("fatal"); + default: return ("??"); + } +} + + +static void +write_all_states(unsigned int states) +{ + unsigned int lowbit; + + /* A trick for computing the lowest set bit. */ + while ((lowbit = states & (1 + ~states)) != 0) { + states &= ~lowbit; /* Clear the low bit. */ + errmsg(state_name(lowbit)); + if (states != 0) + errmsg("/"); + } +} + +/* + * Check magic value and current state; bail if it isn't valid. + * + * This is designed to catch serious programming errors that violate + * the libarchive API.
+ */ +void +__archive_check_magic(struct archive *a, unsigned int magic, + unsigned int state, const char *function) +{ + if (a->magic != magic) { + errmsg("INTERNAL ERROR: Function "); + errmsg(function); + errmsg(" invoked with invalid struct archive structure.\n"); + diediedie(); + } + + if (state == ARCHIVE_STATE_ANY) + return; + + if ((a->state & state) == 0) { + errmsg("INTERNAL ERROR: Function '"); + errmsg(function); + errmsg("' invoked with archive structure in state '"); + write_all_states(a->state); + errmsg("', should be in state '"); + write_all_states(state); + errmsg("'\n"); + diediedie(); + } +} diff --git a/lib/libarchive/archive_crc32.h b/lib/libarchive/archive_crc32.h new file mode 100644 index 000000000..103e5df35 --- /dev/null +++ b/lib/libarchive/archive_crc32.h @@ -0,0 +1,66 @@ +/*- + * Copyright (c) 2009 Joerg Sonnenberger + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR(S) ``AS IS'' AND ANY EXPRESS OR + * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES + * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. 
+ * IN NO EVENT SHALL THE AUTHOR(S) BE LIABLE FOR ANY DIRECT, INDIRECT, + * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT + * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF + * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + * + * $FreeBSD: head/lib/libarchive/archive_crc32.h 201102 2009-12-28 03:11:36Z kientzle $ + */ + +#ifndef __LIBARCHIVE_BUILD +#error This header is only to be used internally to libarchive. +#endif + +/* + * When zlib is unavailable, we should still be able to validate + * uncompressed zip archives. That requires us to be able to compute + * the CRC32 check value. This is a drop-in compatible replacement + * for crc32() from zlib. It's slower than the zlib implementation, + * but still pretty fast: This runs about 300MB/s on my 3GHz P4 + * compared to about 800MB/s for the zlib implementation. + */ +static unsigned long +crc32(unsigned long crc, const void *_p, size_t len) +{ + unsigned long crc2, b, i; + const unsigned char *p = _p; + static volatile int crc_tbl_inited = 0; + static unsigned long crc_tbl[256]; + + if (!crc_tbl_inited) { + for (b = 0; b < 256; ++b) { + crc2 = b; + for (i = 8; i > 0; --i) { + if (crc2 & 1) + crc2 = (crc2 >> 1) ^ 0xedb88320UL; + else + crc2 = (crc2 >> 1); + } + crc_tbl[b] = crc2; + } + crc_tbl_inited = 1; + } + + crc = crc ^ 0xffffffffUL; + while (len--) + crc = crc_tbl[(crc ^ *p++) & 0xff] ^ (crc >> 8); + return (crc ^ 0xffffffffUL); +} diff --git a/lib/libarchive/archive_endian.h b/lib/libarchive/archive_endian.h new file mode 100644 index 000000000..30577d1f5 --- /dev/null +++ b/lib/libarchive/archive_endian.h @@ -0,0 +1,207 @@ +/*- + * Copyright (c) 2002 Thomas Moestl + * All rights reserved. 
+ * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. + * + * $FreeBSD: head/lib/libarchive/archive_endian.h 201085 2009-12-28 02:17:15Z kientzle $ + * + * Borrowed from FreeBSD's <sys/endian.h> + */ + +#ifndef __LIBARCHIVE_BUILD +#error This header is only to be used internally to libarchive. +#endif + +/* Note: This is a purely internal header! */ +/* Do not use this outside of libarchive internal code! */ + +#ifndef ARCHIVE_ENDIAN_H_INCLUDED +#define ARCHIVE_ENDIAN_H_INCLUDED + + +/* + * Disabling inline keyword for compilers known to choke on it: + * - Watcom C++ in C code. (For any version?)
+ * - SGI MIPSpro + * - Microsoft Visual C++ 6.0 (supposedly newer versions too) + */ +#if defined(__WATCOMC__) || defined(__sgi) || defined(__hpux) || defined(__BORLANDC__) || defined(__ACK__) +#define inline +#elif defined(_MSC_VER) +#define inline __inline +#endif + +#ifdef __minix +#include +#endif + +/* Alignment-agnostic encode/decode bytestream to/from little/big endian. */ + +static inline uint16_t +archive_be16dec(const void *pp) +{ + unsigned char const *p = (unsigned char const *)pp; + + return ((p[0] << 8) | p[1]); +} + +static inline uint32_t +archive_be32dec(const void *pp) +{ + unsigned char const *p = (unsigned char const *)pp; + + return ((p[0] << 24) | (p[1] << 16) | (p[2] << 8) | p[3]); +} + +#ifndef __minix +static inline uint64_t +archive_be64dec(const void *pp) +{ + unsigned char const *p = (unsigned char const *)pp; + + return (((uint64_t)archive_be32dec(p) << 32) | archive_be32dec(p + 4)); +} +#else +static inline u64_t +archive_be64dec(const void *pp) +{ + unsigned char const *p = (unsigned char const *)pp; + + return make64(archive_be32dec(p + 4), archive_be32dec(p)); +} +#endif + +static inline uint16_t +archive_le16dec(const void *pp) +{ + unsigned char const *p = (unsigned char const *)pp; + + return ((p[1] << 8) | p[0]); +} + +static inline uint32_t +archive_le32dec(const void *pp) +{ + unsigned char const *p = (unsigned char const *)pp; + + return ((p[3] << 24) | (p[2] << 16) | (p[1] << 8) | p[0]); +} + +#ifndef __minix +static inline uint64_t +archive_le64dec(const void *pp) +{ + unsigned char const *p = (unsigned char const *)pp; + + return (((uint64_t)archive_le32dec(p + 4) << 32) | archive_le32dec(p)); +} +#else +static inline u64_t +archive_le64dec(const void *pp) +{ + unsigned char const *p = (unsigned char const *)pp; + + return make64(archive_le32dec(p), archive_le32dec(p + 4)); +} +#endif + +static inline void +archive_be16enc(void *pp, uint16_t u) +{ + unsigned char *p = (unsigned char *)pp; + + p[0] = (u >> 8) & 0xff; + 
p[1] = u & 0xff; +} + +static inline void +archive_be32enc(void *pp, uint32_t u) +{ + unsigned char *p = (unsigned char *)pp; + + p[0] = (u >> 24) & 0xff; + p[1] = (u >> 16) & 0xff; + p[2] = (u >> 8) & 0xff; + p[3] = u & 0xff; +} + +#ifndef __minix +static inline void +archive_be64enc(void *pp, uint64_t u) +{ + unsigned char *p = (unsigned char *)pp; + + archive_be32enc(p, u >> 32); + archive_be32enc(p + 4, u & 0xffffffff); +} +#else +static inline void +archive_be64enc(void *pp, u64_t u) +{ + unsigned char *p = (unsigned char *)pp; + + archive_be32enc(p, ex64hi(u)); + archive_be32enc(p + 4, ex64lo(u)); +} +#endif + +static inline void +archive_le16enc(void *pp, uint16_t u) +{ + unsigned char *p = (unsigned char *)pp; + + p[0] = u & 0xff; + p[1] = (u >> 8) & 0xff; +} + +static inline void +archive_le32enc(void *pp, uint32_t u) +{ + unsigned char *p = (unsigned char *)pp; + + p[0] = u & 0xff; + p[1] = (u >> 8) & 0xff; + p[2] = (u >> 16) & 0xff; + p[3] = (u >> 24) & 0xff; +} + +#ifndef __minix +static inline void +archive_le64enc(void *pp, uint64_t u) +{ + unsigned char *p = (unsigned char *)pp; + + archive_le32enc(p, u & 0xffffffff); + archive_le32enc(p + 4, u >> 32); +} +#else +static inline void +archive_le64enc(void *pp, u64_t u) +{ + unsigned char *p = (unsigned char *)pp; + + archive_le32enc(p, ex64lo(u)); + archive_le32enc(p + 4, ex64hi(u)); +} +#endif +#endif diff --git a/lib/libarchive/archive_entry.3 b/lib/libarchive/archive_entry.3 new file mode 100644 index 000000000..9ceb18b7a --- /dev/null +++ b/lib/libarchive/archive_entry.3 @@ -0,0 +1,433 @@ +.\" Copyright (c) 2003-2007 Tim Kientzle +.\" All rights reserved. +.\" +.\" Redistribution and use in source and binary forms, with or without +.\" modification, are permitted provided that the following conditions +.\" are met: +.\" 1. Redistributions of source code must retain the above copyright +.\" notice, this list of conditions and the following disclaimer. +.\" 2. 
Redistributions in binary form must reproduce the above copyright +.\" notice, this list of conditions and the following disclaimer in the +.\" documentation and/or other materials provided with the distribution. +.\" +.\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND +.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +.\" ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE +.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS +.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) +.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT +.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY +.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF +.\" SUCH DAMAGE. 
+.\" +.\" $FreeBSD: src/lib/libarchive/archive_entry.3,v 1.18 2008/05/26 17:00:22 kientzle Exp $ +.\" +.Dd May 12, 2008 +.Dt archive_entry 3 +.Os +.Sh NAME +.Nm archive_entry_acl_add_entry , +.Nm archive_entry_acl_add_entry_w , +.Nm archive_entry_acl_clear , +.Nm archive_entry_acl_count , +.Nm archive_entry_acl_next , +.Nm archive_entry_acl_next_w , +.Nm archive_entry_acl_reset , +.Nm archive_entry_acl_text_w , +.Nm archive_entry_atime , +.Nm archive_entry_atime_nsec , +.Nm archive_entry_clear , +.Nm archive_entry_clone , +.Nm archive_entry_copy_fflags_text , +.Nm archive_entry_copy_fflags_text_w , +.Nm archive_entry_copy_gname , +.Nm archive_entry_copy_gname_w , +.Nm archive_entry_copy_hardlink , +.Nm archive_entry_copy_hardlink_w , +.Nm archive_entry_copy_link , +.Nm archive_entry_copy_link_w , +.Nm archive_entry_copy_pathname_w , +.Nm archive_entry_copy_sourcepath , +.Nm archive_entry_copy_stat , +.Nm archive_entry_copy_symlink , +.Nm archive_entry_copy_symlink_w , +.Nm archive_entry_copy_uname , +.Nm archive_entry_copy_uname_w , +.Nm archive_entry_dev , +.Nm archive_entry_devmajor , +.Nm archive_entry_devminor , +.Nm archive_entry_filetype , +.Nm archive_entry_fflags , +.Nm archive_entry_fflags_text , +.Nm archive_entry_free , +.Nm archive_entry_gid , +.Nm archive_entry_gname , +.Nm archive_entry_hardlink , +.Nm archive_entry_ino , +.Nm archive_entry_mode , +.Nm archive_entry_mtime , +.Nm archive_entry_mtime_nsec , +.Nm archive_entry_nlink , +.Nm archive_entry_new , +.Nm archive_entry_pathname , +.Nm archive_entry_pathname_w , +.Nm archive_entry_rdev , +.Nm archive_entry_rdevmajor , +.Nm archive_entry_rdevminor , +.Nm archive_entry_set_atime , +.Nm archive_entry_set_ctime , +.Nm archive_entry_set_dev , +.Nm archive_entry_set_devmajor , +.Nm archive_entry_set_devminor , +.Nm archive_entry_set_filetype , +.Nm archive_entry_set_fflags , +.Nm archive_entry_set_gid , +.Nm archive_entry_set_gname , +.Nm archive_entry_set_hardlink , +.Nm archive_entry_set_link , +.Nm 
archive_entry_set_mode , +.Nm archive_entry_set_mtime , +.Nm archive_entry_set_pathname , +.Nm archive_entry_set_rdevmajor , +.Nm archive_entry_set_rdevminor , +.Nm archive_entry_set_size , +.Nm archive_entry_set_symlink , +.Nm archive_entry_set_uid , +.Nm archive_entry_set_uname , +.Nm archive_entry_size , +.Nm archive_entry_sourcepath , +.Nm archive_entry_stat , +.Nm archive_entry_symlink , +.Nm archive_entry_uid , +.Nm archive_entry_uname +.Nd functions for manipulating archive entry descriptions +.Sh SYNOPSIS +.In archive_entry.h +.Ft void +.Fo archive_entry_acl_add_entry +.Fa "struct archive_entry *" +.Fa "int type" +.Fa "int permset" +.Fa "int tag" +.Fa "int qual" +.Fa "const char *name" +.Fc +.Ft void +.Fo archive_entry_acl_add_entry_w +.Fa "struct archive_entry *" +.Fa "int type" +.Fa "int permset" +.Fa "int tag" +.Fa "int qual" +.Fa "const wchar_t *name" +.Fc +.Ft void +.Fn archive_entry_acl_clear "struct archive_entry *" +.Ft int +.Fn archive_entry_acl_count "struct archive_entry *" "int type" +.Ft int +.Fo archive_entry_acl_next +.Fa "struct archive_entry *" +.Fa "int want_type" +.Fa "int *type" +.Fa "int *permset" +.Fa "int *tag" +.Fa "int *qual" +.Fa "const char **name" +.Fc +.Ft int +.Fo archive_entry_acl_next_w +.Fa "struct archive_entry *" +.Fa "int want_type" +.Fa "int *type" +.Fa "int *permset" +.Fa "int *tag" +.Fa "int *qual" +.Fa "const wchar_t **name" +.Fc +.Ft int +.Fn archive_entry_acl_reset "struct archive_entry *" "int want_type" +.Ft const wchar_t * +.Fn archive_entry_acl_text_w "struct archive_entry *" "int flags" +.Ft time_t +.Fn archive_entry_atime "struct archive_entry *" +.Ft long +.Fn archive_entry_atime_nsec "struct archive_entry *" +.Ft "struct archive_entry *" +.Fn archive_entry_clear "struct archive_entry *" +.Ft struct archive_entry * +.Fn archive_entry_clone "struct archive_entry *" +.Ft const char * +.Fn archive_entry_copy_fflags_text "struct archive_entry *" "const char *" +.Ft const wchar_t * +.Fn
archive_entry_copy_fflags_text_w "struct archive_entry *" "const wchar_t *" +.Ft void +.Fn archive_entry_copy_gname "struct archive_entry *" "const char *" +.Ft void +.Fn archive_entry_copy_gname_w "struct archive_entry *" "const wchar_t *" +.Ft void +.Fn archive_entry_copy_hardlink "struct archive_entry *" "const char *" +.Ft void +.Fn archive_entry_copy_hardlink_w "struct archive_entry *" "const wchar_t *" +.Ft void +.Fn archive_entry_copy_sourcepath "struct archive_entry *" "const char *" +.Ft void +.Fn archive_entry_copy_pathname_w "struct archive_entry *" "const wchar_t *" +.Ft void +.Fn archive_entry_copy_stat "struct archive_entry *" "const struct stat *" +.Ft void +.Fn archive_entry_copy_symlink "struct archive_entry *" "const char *" +.Ft void +.Fn archive_entry_copy_symlink_w "struct archive_entry *" "const wchar_t *" +.Ft void +.Fn archive_entry_copy_uname "struct archive_entry *" "const char *" +.Ft void +.Fn archive_entry_copy_uname_w "struct archive_entry *" "const wchar_t *" +.Ft dev_t +.Fn archive_entry_dev "struct archive_entry *" +.Ft dev_t +.Fn archive_entry_devmajor "struct archive_entry *" +.Ft dev_t +.Fn archive_entry_devminor "struct archive_entry *" +.Ft mode_t +.Fn archive_entry_filetype "struct archive_entry *" +.Ft void +.Fo archive_entry_fflags +.Fa "struct archive_entry *" +.Fa "unsigned long *set" +.Fa "unsigned long *clear" +.Fc +.Ft const char * +.Fn archive_entry_fflags_text "struct archive_entry *" +.Ft void +.Fn archive_entry_free "struct archive_entry *" +.Ft const char * +.Fn archive_entry_gname "struct archive_entry *" +.Ft const char * +.Fn archive_entry_hardlink "struct archive_entry *" +.Ft ino_t +.Fn archive_entry_ino "struct archive_entry *" +.Ft mode_t +.Fn archive_entry_mode "struct archive_entry *" +.Ft time_t +.Fn archive_entry_mtime "struct archive_entry *" +.Ft long +.Fn archive_entry_mtime_nsec "struct archive_entry *" +.Ft unsigned int +.Fn archive_entry_nlink "struct archive_entry *" +.Ft struct archive_entry * 
+.Fn archive_entry_new "void" +.Ft const char * +.Fn archive_entry_pathname "struct archive_entry *" +.Ft const wchar_t * +.Fn archive_entry_pathname_w "struct archive_entry *" +.Ft dev_t +.Fn archive_entry_rdev "struct archive_entry *" +.Ft dev_t +.Fn archive_entry_rdevmajor "struct archive_entry *" +.Ft dev_t +.Fn archive_entry_rdevminor "struct archive_entry *" +.Ft void +.Fn archive_entry_set_dev "struct archive_entry *" "dev_t" +.Ft void +.Fn archive_entry_set_devmajor "struct archive_entry *" "dev_t" +.Ft void +.Fn archive_entry_set_devminor "struct archive_entry *" "dev_t" +.Ft void +.Fn archive_entry_set_filetype "struct archive_entry *" "unsigned int" +.Ft void +.Fo archive_entry_set_fflags +.Fa "struct archive_entry *" +.Fa "unsigned long set" +.Fa "unsigned long clear" +.Fc +.Ft void +.Fn archive_entry_set_gid "struct archive_entry *" "gid_t" +.Ft void +.Fn archive_entry_set_gname "struct archive_entry *" "const char *" +.Ft void +.Fn archive_entry_set_hardlink "struct archive_entry *" "const char *" +.Ft void +.Fn archive_entry_set_ino "struct archive_entry *" "unsigned long" +.Ft void +.Fn archive_entry_set_link "struct archive_entry *" "const char *" +.Ft void +.Fn archive_entry_set_mode "struct archive_entry *" "mode_t" +.Ft void +.Fn archive_entry_set_mtime "struct archive_entry *" "time_t" "long nanos" +.Ft void +.Fn archive_entry_set_nlink "struct archive_entry *" "unsigned int" +.Ft void +.Fn archive_entry_set_pathname "struct archive_entry *" "const char *" +.Ft void +.Fn archive_entry_set_rdev "struct archive_entry *" "dev_t" +.Ft void +.Fn archive_entry_set_rdevmajor "struct archive_entry *" "dev_t" +.Ft void +.Fn archive_entry_set_rdevminor "struct archive_entry *" "dev_t" +.Ft void +.Fn archive_entry_set_size "struct archive_entry *" "int64_t" +.Ft void +.Fn archive_entry_set_symlink "struct archive_entry *" "const char *" +.Ft void +.Fn archive_entry_set_uid "struct archive_entry *" "uid_t" +.Ft void +.Fn archive_entry_set_uname "struct 
archive_entry *" "const char *" +.Ft int64_t +.Fn archive_entry_size "struct archive_entry *" +.Ft const char * +.Fn archive_entry_sourcepath "struct archive_entry *" +.Ft const struct stat * +.Fn archive_entry_stat "struct archive_entry *" +.Ft const char * +.Fn archive_entry_symlink "struct archive_entry *" +.Ft const char * +.Fn archive_entry_uname "struct archive_entry *" +.Sh DESCRIPTION +These functions create and manipulate data objects that +represent entries within an archive. +You can think of a +.Tn struct archive_entry +as a heavy-duty version of +.Tn struct stat : +it includes everything from +.Tn struct stat +plus associated pathname, textual group and user names, etc. +These objects are used by +.Xr libarchive 3 +to represent the metadata associated with a particular +entry in an archive. +.Ss Create and Destroy +There are functions to allocate, destroy, clear, and copy +.Va archive_entry +objects: +.Bl -tag -compact -width indent +.It Fn archive_entry_clear +Erases the object, resetting all internal fields to the +same state as a newly-created object. +This is provided to allow you to quickly recycle objects +without thrashing the heap. +.It Fn archive_entry_clone +A deep copy operation; all text fields are duplicated. +.It Fn archive_entry_free +Releases the +.Tn struct archive_entry +object. +.It Fn archive_entry_new +Allocate and return a blank +.Tn struct archive_entry +object. +.El +.Ss Set and Get Functions +Most of the functions here set or read entries in an object. +Such functions have one of the following forms: +.Bl -tag -compact -width indent +.It Fn archive_entry_set_XXXX +Stores the provided data in the object. +In particular, for strings, the pointer is stored, +not the referenced string. +.It Fn archive_entry_copy_XXXX +As above, except that the referenced data is copied +into the object. +.It Fn archive_entry_XXXX +Returns the specified data. +In the case of strings, a const-qualified pointer to +the string is returned. 
+.El +String data can be set or accessed as wide character strings +or normal +.Va char +strings. +The functions that use wide character strings are suffixed with +.Cm _w . +Note that these are different representations of the same data: +For example, if you store a narrow string and read the corresponding +wide string, the object will transparently convert formats +using the current locale. +Similarly, if you store a wide string and then store a +narrow string for the same data, the previously-set wide string will +be discarded in favor of the new data. +.Pp +There are a few set/get functions that merit additional description: +.Bl -tag -compact -width indent +.It Fn archive_entry_set_link +This function sets the symlink field if it is already set. +Otherwise, it sets the hardlink field. +.El +.Ss File Flags +File flags are transparently converted between a bitmap +representation and a textual format. +For example, if you set the bitmap and ask for text, the library +will build a canonical text format. +However, if you set a text format and request a text format, +you will get back the same text, even if it is ill-formed. +If you need to canonicalize a textual flags string, you should first set the +text form, then request the bitmap form, then use that to set the bitmap form. +Setting the bitmap format will clear the internal text representation +and force it to be reconstructed when you next request the text form. +.Pp +The bitmap format consists of two integers, one containing bits +that should be set, the other specifying bits that should be +cleared. +Bits not mentioned in either bitmap will be ignored. +Usually, the bitmap of bits to be cleared will be set to zero. +In unusual circumstances, you can force a fully-specified set +of file flags by setting the bitmap of flags to clear to the complement +of the bitmap of flags to set. +(This differs from +.Xr fflagstostr 3 , +which only includes names for set bits.) 
+Converting a bitmap to a textual string is a platform-specific +operation; bits that are not meaningful on the current platform +will be ignored. +.Pp +The canonical text format is a comma-separated list of flag names. +The +.Fn archive_entry_copy_fflags_text +and +.Fn archive_entry_copy_fflags_text_w +functions parse the provided text and set the internal bitmap values. +This is a platform-specific operation; names that are not meaningful +on the current platform will be ignored. +The function returns a pointer to the start of the first name that was not +recognized, or NULL if every name was recognized. +Note that every name--including names that follow an unrecognized name--will +be evaluated, and the bitmaps will be set to reflect every name that is +recognized. +(In particular, this differs from +.Xr strtofflags 3 , +which stops parsing at the first unrecognized name.) +.Ss ACL Handling +XXX This needs serious help. +XXX +.Pp +An +.Dq Access Control List +(ACL) is a list of permissions that grant access to particular users or +groups beyond what would normally be provided by standard POSIX mode bits. +The ACL handling here addresses some deficiencies in the POSIX.1e draft 17 ACL +specification. +In particular, POSIX.1e draft 17 specifies several different formats, but +none of those formats include both textual user/group names and numeric +UIDs/GIDs. +.Pp +XXX explain ACL stuff XXX +.\" .Sh EXAMPLE +.\" .Sh RETURN VALUES +.\" .Sh ERRORS +.Sh SEE ALSO +.Xr archive 3 +.Sh HISTORY +The +.Nm libarchive +library first appeared in +.Fx 5.3 . +.Sh AUTHORS +.An -nosplit +The +.Nm libarchive +library was written by +.An Tim Kientzle Aq kientzle@acm.org . +.\" .Sh BUGS diff --git a/lib/libarchive/archive_entry.c b/lib/libarchive/archive_entry.c new file mode 100644 index 000000000..a2364c05a --- /dev/null +++ b/lib/libarchive/archive_entry.c @@ -0,0 +1,2225 @@ +/*- + * Copyright (c) 2003-2007 Tim Kientzle + * All rights reserved.
+ * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR(S) ``AS IS'' AND ANY EXPRESS OR + * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES + * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. + * IN NO EVENT SHALL THE AUTHOR(S) BE LIABLE FOR ANY DIRECT, INDIRECT, + * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT + * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF + * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + */ + +#include "archive_platform.h" +__FBSDID("$FreeBSD: head/lib/libarchive/archive_entry.c 201096 2009-12-28 02:41:27Z kientzle $"); + +#ifdef HAVE_SYS_STAT_H +#include <sys/stat.h> +#endif +#ifdef HAVE_SYS_TYPES_H +#include <sys/types.h> +#endif +#if MAJOR_IN_MKDEV +#include <sys/mkdev.h> +#define HAVE_MAJOR +#elif MAJOR_IN_SYSMACROS +#include <sys/sysmacros.h> +#define HAVE_MAJOR +#endif +#ifdef HAVE_LIMITS_H +#include <limits.h> +#endif +#ifdef HAVE_LINUX_FS_H +#include <linux/fs.h> /* for Linux file flags */ +#endif +/* + * Some Linux distributions have both linux/ext2_fs.h and ext2fs/ext2_fs.h. + * As the include guards don't agree, the order of include is important.
+ */ +#ifdef HAVE_LINUX_EXT2_FS_H +#include <linux/ext2_fs.h> /* for Linux file flags */ +#endif +#if defined(HAVE_EXT2FS_EXT2_FS_H) && !defined(__CYGWIN__) +#include <ext2fs/ext2_fs.h> /* for Linux file flags */ +#endif +#include <stddef.h> +#include <stdio.h> +#ifdef HAVE_STDLIB_H +#include <stdlib.h> +#endif +#ifdef HAVE_STRING_H +#include <string.h> +#endif +#ifdef HAVE_WCHAR_H +#include <wchar.h> +#endif + +#include "archive.h" +#include "archive_entry.h" +#include "archive_private.h" +#include "archive_entry_private.h" + +#undef max +#define max(a, b) ((a)>(b)?(a):(b)) + +#if !defined(HAVE_MAJOR) && !defined(major) +/* Replacement for major/minor/makedev. */ +#define major(x) ((int)(0x00ff & ((x) >> 8))) +#define minor(x) ((int)(0xffff00ff & (x))) +#define makedev(maj,min) ((0xff00 & ((maj)<<8)) | (0xffff00ff & (min))) +#endif + +/* Play games to come up with a suitable makedev() definition. */ +#ifdef __QNXNTO__ +/* QNX. */ +#include <sys/netmgr.h> +#define ae_makedev(maj, min) makedev(ND_LOCAL_NODE, (maj), (min)) +#elif defined makedev +/* There's a "makedev" macro. */ +#define ae_makedev(maj, min) makedev((maj), (min)) +#elif defined mkdev || ((defined _WIN32 || defined __WIN32__) && !defined(__CYGWIN__)) +/* Windows. */ +#define ae_makedev(maj, min) mkdev((maj), (min)) +#else +/* There's a "makedev" function.
*/ +#define ae_makedev(maj, min) makedev((maj), (min)) +#endif + +static void aes_clean(struct aes *); +static void aes_copy(struct aes *dest, struct aes *src); +static const char * aes_get_mbs(struct aes *); +static const wchar_t * aes_get_wcs(struct aes *); +static int aes_set_mbs(struct aes *, const char *mbs); +static int aes_copy_mbs(struct aes *, const char *mbs); +/* static void aes_set_wcs(struct aes *, const wchar_t *wcs); */ +static int aes_copy_wcs(struct aes *, const wchar_t *wcs); +static int aes_copy_wcs_len(struct aes *, const wchar_t *wcs, size_t); + +static char * ae_fflagstostr(unsigned long bitset, unsigned long bitclear); +static const wchar_t *ae_wcstofflags(const wchar_t *stringp, + unsigned long *setp, unsigned long *clrp); +static const char *ae_strtofflags(const char *stringp, + unsigned long *setp, unsigned long *clrp); +static void append_entry_w(wchar_t **wp, const wchar_t *prefix, int tag, + const wchar_t *wname, int perm, int id); +static void append_id_w(wchar_t **wp, int id); + +static int acl_special(struct archive_entry *entry, + int type, int permset, int tag); +static struct ae_acl *acl_new_entry(struct archive_entry *entry, + int type, int permset, int tag, int id); +static int isint_w(const wchar_t *start, const wchar_t *end, int *result); +static int ismode_w(const wchar_t *start, const wchar_t *end, int *result); +static void next_field_w(const wchar_t **wp, const wchar_t **start, + const wchar_t **end, wchar_t *sep); +static int prefix_w(const wchar_t *start, const wchar_t *end, + const wchar_t *test); +static void +archive_entry_acl_add_entry_w_len(struct archive_entry *entry, int type, + int permset, int tag, int id, const wchar_t *name, size_t); + + +#ifndef HAVE_WCSCPY +static wchar_t * wcscpy(wchar_t *s1, const wchar_t *s2) +{ + wchar_t *dest = s1; + while ((*s1 = *s2) != L'\0') + ++s1, ++s2; + return dest; +} +#endif +#ifndef HAVE_WCSLEN +static size_t wcslen(const wchar_t *s) +{ + const wchar_t *p = s; + while (*p != 
L'\0') + ++p; + return p - s; +} +#endif +#ifndef HAVE_WMEMCMP +/* Good enough for simple equality testing, but not for sorting. */ +#define wmemcmp(a,b,i) memcmp((a), (b), (i) * sizeof(wchar_t)) +#endif +#ifndef HAVE_WMEMCPY +#define wmemcpy(a,b,i) (wchar_t *)memcpy((a), (b), (i) * sizeof(wchar_t)) +#endif + +static void +aes_clean(struct aes *aes) +{ + if (aes->aes_wcs) { + free((wchar_t *)(uintptr_t)aes->aes_wcs); + aes->aes_wcs = NULL; + } + archive_string_free(&(aes->aes_mbs)); + archive_string_free(&(aes->aes_utf8)); + aes->aes_set = 0; +} + +static void +aes_copy(struct aes *dest, struct aes *src) +{ + wchar_t *wp; + + dest->aes_set = src->aes_set; + archive_string_copy(&(dest->aes_mbs), &(src->aes_mbs)); + archive_string_copy(&(dest->aes_utf8), &(src->aes_utf8)); + + if (src->aes_wcs != NULL) { + wp = (wchar_t *)malloc((wcslen(src->aes_wcs) + 1) + * sizeof(wchar_t)); + if (wp == NULL) + __archive_errx(1, "No memory for aes_copy()"); + wcscpy(wp, src->aes_wcs); + dest->aes_wcs = wp; + } +} + +static const char * +aes_get_utf8(struct aes *aes) +{ + if (aes->aes_set & AES_SET_UTF8) + return (aes->aes_utf8.s); + if ((aes->aes_set & AES_SET_WCS) + && archive_strappend_w_utf8(&(aes->aes_utf8), aes->aes_wcs) != NULL) { + aes->aes_set |= AES_SET_UTF8; + return (aes->aes_utf8.s); + } + return (NULL); +} + +static const char * +aes_get_mbs(struct aes *aes) +{ + /* If we already have an MBS form, return that immediately. */ + if (aes->aes_set & AES_SET_MBS) + return (aes->aes_mbs.s); + /* If there's a WCS form, try converting with the native locale. */ + if ((aes->aes_set & AES_SET_WCS) + && archive_strappend_w_mbs(&(aes->aes_mbs), aes->aes_wcs) != NULL) { + aes->aes_set |= AES_SET_MBS; + return (aes->aes_mbs.s); + } + /* We'll use UTF-8 for MBS if all else fails. */ + return (aes_get_utf8(aes)); +} + +static const wchar_t * +aes_get_wcs(struct aes *aes) +{ + wchar_t *w; + size_t r; + + /* Return WCS form if we already have it. 
*/ + if (aes->aes_set & AES_SET_WCS) + return (aes->aes_wcs); + + if (aes->aes_set & AES_SET_MBS) { + /* Try converting MBS to WCS using native locale. */ + /* + * No single byte will be more than one wide character, + * so this length estimate will always be big enough. + */ + size_t wcs_length = aes->aes_mbs.length; + + w = (wchar_t *)malloc((wcs_length + 1) * sizeof(wchar_t)); + if (w == NULL) + __archive_errx(1, "No memory for aes_get_wcs()"); + r = mbstowcs(w, aes->aes_mbs.s, wcs_length); + if (r != (size_t)-1 && r != 0) { + w[r] = 0; + aes->aes_set |= AES_SET_WCS; + return (aes->aes_wcs = w); + } + free(w); + } + + if (aes->aes_set & AES_SET_UTF8) { + /* Try converting UTF8 to WCS. */ + aes->aes_wcs = __archive_string_utf8_w(&(aes->aes_utf8)); + if (aes->aes_wcs != NULL) + aes->aes_set |= AES_SET_WCS; + return (aes->aes_wcs); + } + return (NULL); +} + +static int +aes_set_mbs(struct aes *aes, const char *mbs) +{ + return (aes_copy_mbs(aes, mbs)); +} + +static int +aes_copy_mbs(struct aes *aes, const char *mbs) +{ + if (mbs == NULL) { + aes->aes_set = 0; + return (0); + } + aes->aes_set = AES_SET_MBS; /* Only MBS form is set now. */ + archive_strcpy(&(aes->aes_mbs), mbs); + archive_string_empty(&(aes->aes_utf8)); + if (aes->aes_wcs) { + free((wchar_t *)(uintptr_t)aes->aes_wcs); + aes->aes_wcs = NULL; + } + return (0); +} + +/* + * The 'update' form tries to proactively update all forms of + * this string (WCS and MBS) and returns an error if any of + * them fail. This is used by the 'pax' handler, for instance, + * to detect and report character-conversion failures early while + * still allowing clients to get potentially useful values from + * the more tolerant lazy conversions. (get_mbs and get_wcs will + * strive to give the user something useful, so you can get hopefully + * usable values even if some of the character conversions are failing.) 
+ */ +static int +aes_update_utf8(struct aes *aes, const char *utf8) +{ + if (utf8 == NULL) { + aes->aes_set = 0; + return (1); /* Succeeded in clearing everything. */ + } + + /* Save the UTF8 string. */ + archive_strcpy(&(aes->aes_utf8), utf8); + + /* Empty the mbs and wcs strings. */ + archive_string_empty(&(aes->aes_mbs)); + if (aes->aes_wcs) { + free((wchar_t *)(uintptr_t)aes->aes_wcs); + aes->aes_wcs = NULL; + } + + aes->aes_set = AES_SET_UTF8; /* Only UTF8 is set now. */ + + /* TODO: We should just do a direct UTF-8 to MBS conversion + * here. That would be faster, use less space, and give the + * same information. (If a UTF-8 to MBS conversion succeeds, + * then UTF-8->WCS and Unicode->MBS conversions will both + * succeed.) */ + + /* Try converting UTF8 to WCS, return false on failure. */ + aes->aes_wcs = __archive_string_utf8_w(&(aes->aes_utf8)); + if (aes->aes_wcs == NULL) + return (0); + aes->aes_set = AES_SET_UTF8 | AES_SET_WCS; /* Both UTF8 and WCS set. */ + + /* Try converting WCS to MBS, return false on failure. */ + if (archive_strappend_w_mbs(&(aes->aes_mbs), aes->aes_wcs) == NULL) + return (0); + aes->aes_set = AES_SET_UTF8 | AES_SET_WCS | AES_SET_MBS; + + /* All conversions succeeded. */ + return (1); +} + +static int +aes_copy_wcs(struct aes *aes, const wchar_t *wcs) +{ + return aes_copy_wcs_len(aes, wcs, wcs == NULL ? 0 : wcslen(wcs)); +} + +static int +aes_copy_wcs_len(struct aes *aes, const wchar_t *wcs, size_t len) +{ + wchar_t *w; + + if (wcs == NULL) { + aes->aes_set = 0; + return (0); + } + aes->aes_set = AES_SET_WCS; /* Only WCS form set. 
*/ + archive_string_empty(&(aes->aes_mbs)); + archive_string_empty(&(aes->aes_utf8)); + if (aes->aes_wcs) { + free((wchar_t *)(uintptr_t)aes->aes_wcs); + aes->aes_wcs = NULL; + } + w = (wchar_t *)malloc((len + 1) * sizeof(wchar_t)); + if (w == NULL) + __archive_errx(1, "No memory for aes_copy_wcs()"); + wmemcpy(w, wcs, len); + w[len] = L'\0'; + aes->aes_wcs = w; + return (0); +} + +/**************************************************************************** + * + * Public Interface + * + ****************************************************************************/ + +struct archive_entry * +archive_entry_clear(struct archive_entry *entry) +{ + if (entry == NULL) + return (NULL); + aes_clean(&entry->ae_fflags_text); + aes_clean(&entry->ae_gname); + aes_clean(&entry->ae_hardlink); + aes_clean(&entry->ae_pathname); + aes_clean(&entry->ae_sourcepath); + aes_clean(&entry->ae_symlink); + aes_clean(&entry->ae_uname); + archive_entry_acl_clear(entry); + archive_entry_xattr_clear(entry); + free(entry->stat); + memset(entry, 0, sizeof(*entry)); + return entry; +} + +struct archive_entry * +archive_entry_clone(struct archive_entry *entry) +{ + struct archive_entry *entry2; + struct ae_acl *ap, *ap2; + struct ae_xattr *xp; + + /* Allocate new structure and copy over all of the fields. 
*/ + entry2 = (struct archive_entry *)malloc(sizeof(*entry2)); + if (entry2 == NULL) + return (NULL); + memset(entry2, 0, sizeof(*entry2)); + entry2->ae_stat = entry->ae_stat; + entry2->ae_fflags_set = entry->ae_fflags_set; + entry2->ae_fflags_clear = entry->ae_fflags_clear; + + aes_copy(&entry2->ae_fflags_text, &entry->ae_fflags_text); + aes_copy(&entry2->ae_gname, &entry->ae_gname); + aes_copy(&entry2->ae_hardlink, &entry->ae_hardlink); + aes_copy(&entry2->ae_pathname, &entry->ae_pathname); + aes_copy(&entry2->ae_sourcepath, &entry->ae_sourcepath); + aes_copy(&entry2->ae_symlink, &entry->ae_symlink); + entry2->ae_set = entry->ae_set; + aes_copy(&entry2->ae_uname, &entry->ae_uname); + + /* Copy ACL data over. */ + ap = entry->acl_head; + while (ap != NULL) { + ap2 = acl_new_entry(entry2, + ap->type, ap->permset, ap->tag, ap->id); + if (ap2 != NULL) + aes_copy(&ap2->name, &ap->name); + ap = ap->next; + } + + /* Copy xattr data over. */ + xp = entry->xattr_head; + while (xp != NULL) { + archive_entry_xattr_add_entry(entry2, + xp->name, xp->value, xp->size); + xp = xp->next; + } + + return (entry2); +} + +void +archive_entry_free(struct archive_entry *entry) +{ + archive_entry_clear(entry); + free(entry); +} + +struct archive_entry * +archive_entry_new(void) +{ + struct archive_entry *entry; + + entry = (struct archive_entry *)malloc(sizeof(*entry)); + if (entry == NULL) + return (NULL); + memset(entry, 0, sizeof(*entry)); + return (entry); +} + +/* + * Functions for reading fields from an archive_entry. 
+ */ + +time_t +archive_entry_atime(struct archive_entry *entry) +{ + return (entry->ae_stat.aest_atime); +} + +long +archive_entry_atime_nsec(struct archive_entry *entry) +{ + return (entry->ae_stat.aest_atime_nsec); +} + +int +archive_entry_atime_is_set(struct archive_entry *entry) +{ + return (entry->ae_set & AE_SET_ATIME); +} + +time_t +archive_entry_birthtime(struct archive_entry *entry) +{ + return (entry->ae_stat.aest_birthtime); +} + +long +archive_entry_birthtime_nsec(struct archive_entry *entry) +{ + return (entry->ae_stat.aest_birthtime_nsec); +} + +int +archive_entry_birthtime_is_set(struct archive_entry *entry) +{ + return (entry->ae_set & AE_SET_BIRTHTIME); +} + +time_t +archive_entry_ctime(struct archive_entry *entry) +{ + return (entry->ae_stat.aest_ctime); +} + +int +archive_entry_ctime_is_set(struct archive_entry *entry) +{ + return (entry->ae_set & AE_SET_CTIME); +} + +long +archive_entry_ctime_nsec(struct archive_entry *entry) +{ + return (entry->ae_stat.aest_ctime_nsec); +} + +dev_t +archive_entry_dev(struct archive_entry *entry) +{ + if (entry->ae_stat.aest_dev_is_broken_down) + return ae_makedev(entry->ae_stat.aest_devmajor, + entry->ae_stat.aest_devminor); + else + return (entry->ae_stat.aest_dev); +} + +dev_t +archive_entry_devmajor(struct archive_entry *entry) +{ + if (entry->ae_stat.aest_dev_is_broken_down) + return (entry->ae_stat.aest_devmajor); + else + return major(entry->ae_stat.aest_dev); +} + +dev_t +archive_entry_devminor(struct archive_entry *entry) +{ + if (entry->ae_stat.aest_dev_is_broken_down) + return (entry->ae_stat.aest_devminor); + else + return minor(entry->ae_stat.aest_dev); +} + +mode_t +archive_entry_filetype(struct archive_entry *entry) +{ + return (AE_IFMT & entry->ae_stat.aest_mode); +} + +void +archive_entry_fflags(struct archive_entry *entry, + unsigned long *set, unsigned long *clear) +{ + *set = entry->ae_fflags_set; + *clear = entry->ae_fflags_clear; +} + +/* + * Note: if text was provided, this just returns 
that text. If you + * really need the text to be rebuilt in a canonical form, set the + * text, ask for the bitmaps, then set the bitmaps. (Setting the + * bitmaps clears any stored text.) This design is deliberate: if + * we're editing archives, we don't want to discard flags just because + * they aren't supported on the current system. The bitmap<->text + * conversions are platform-specific (see below). + */ +const char * +archive_entry_fflags_text(struct archive_entry *entry) +{ + const char *f; + char *p; + + f = aes_get_mbs(&entry->ae_fflags_text); + if (f != NULL) + return (f); + + if (entry->ae_fflags_set == 0 && entry->ae_fflags_clear == 0) + return (NULL); + + p = ae_fflagstostr(entry->ae_fflags_set, entry->ae_fflags_clear); + if (p == NULL) + return (NULL); + + aes_copy_mbs(&entry->ae_fflags_text, p); + free(p); + f = aes_get_mbs(&entry->ae_fflags_text); + return (f); +} + +gid_t +archive_entry_gid(struct archive_entry *entry) +{ + return (entry->ae_stat.aest_gid); +} + +const char * +archive_entry_gname(struct archive_entry *entry) +{ + return (aes_get_mbs(&entry->ae_gname)); +} + +const wchar_t * +archive_entry_gname_w(struct archive_entry *entry) +{ + return (aes_get_wcs(&entry->ae_gname)); +} + +const char * +archive_entry_hardlink(struct archive_entry *entry) +{ + if (entry->ae_set & AE_SET_HARDLINK) + return (aes_get_mbs(&entry->ae_hardlink)); + return (NULL); +} + +const wchar_t * +archive_entry_hardlink_w(struct archive_entry *entry) +{ + if (entry->ae_set & AE_SET_HARDLINK) + return (aes_get_wcs(&entry->ae_hardlink)); + return (NULL); +} + +ino_t +archive_entry_ino(struct archive_entry *entry) +{ + return (entry->ae_stat.aest_ino); +} + +#ifndef __minix +int64_t +archive_entry_ino64(struct archive_entry *entry) +{ + return (entry->ae_stat.aest_ino); +} +#endif + + +mode_t +archive_entry_mode(struct archive_entry *entry) +{ + return (entry->ae_stat.aest_mode); +} + +time_t +archive_entry_mtime(struct archive_entry *entry) +{ + return 
(entry->ae_stat.aest_mtime); +} + +long +archive_entry_mtime_nsec(struct archive_entry *entry) +{ + return (entry->ae_stat.aest_mtime_nsec); +} + +int +archive_entry_mtime_is_set(struct archive_entry *entry) +{ + return (entry->ae_set & AE_SET_MTIME); +} + +unsigned int +archive_entry_nlink(struct archive_entry *entry) +{ + return (entry->ae_stat.aest_nlink); +} + +const char * +archive_entry_pathname(struct archive_entry *entry) +{ + return (aes_get_mbs(&entry->ae_pathname)); +} + +const wchar_t * +archive_entry_pathname_w(struct archive_entry *entry) +{ + return (aes_get_wcs(&entry->ae_pathname)); +} + +dev_t +archive_entry_rdev(struct archive_entry *entry) +{ + if (entry->ae_stat.aest_rdev_is_broken_down) + return ae_makedev(entry->ae_stat.aest_rdevmajor, + entry->ae_stat.aest_rdevminor); + else + return (entry->ae_stat.aest_rdev); +} + +dev_t +archive_entry_rdevmajor(struct archive_entry *entry) +{ + if (entry->ae_stat.aest_rdev_is_broken_down) + return (entry->ae_stat.aest_rdevmajor); + else + return major(entry->ae_stat.aest_rdev); +} + +dev_t +archive_entry_rdevminor(struct archive_entry *entry) +{ + if (entry->ae_stat.aest_rdev_is_broken_down) + return (entry->ae_stat.aest_rdevminor); + else + return minor(entry->ae_stat.aest_rdev); +} + +#ifndef __minix +int64_t +archive_entry_size(struct archive_entry *entry) +{ + return (entry->ae_stat.aest_size); +} +#else +ssize_t +archive_entry_size(struct archive_entry *entry) +{ + return (entry->ae_stat.aest_size); +} +#endif + +int +archive_entry_size_is_set(struct archive_entry *entry) +{ + return (entry->ae_set & AE_SET_SIZE); +} + +const char * +archive_entry_sourcepath(struct archive_entry *entry) +{ + return (aes_get_mbs(&entry->ae_sourcepath)); +} + +const char * +archive_entry_symlink(struct archive_entry *entry) +{ + if (entry->ae_set & AE_SET_SYMLINK) + return (aes_get_mbs(&entry->ae_symlink)); + return (NULL); +} + +const wchar_t * +archive_entry_symlink_w(struct archive_entry *entry) +{ + if 
(entry->ae_set & AE_SET_SYMLINK) + return (aes_get_wcs(&entry->ae_symlink)); + return (NULL); +} + +uid_t +archive_entry_uid(struct archive_entry *entry) +{ + return (entry->ae_stat.aest_uid); +} + +const char * +archive_entry_uname(struct archive_entry *entry) +{ + return (aes_get_mbs(&entry->ae_uname)); +} + +const wchar_t * +archive_entry_uname_w(struct archive_entry *entry) +{ + return (aes_get_wcs(&entry->ae_uname)); +} + +/* + * Functions to set archive_entry properties. + */ + +void +archive_entry_set_filetype(struct archive_entry *entry, unsigned int type) +{ + entry->stat_valid = 0; + entry->ae_stat.aest_mode &= ~AE_IFMT; + entry->ae_stat.aest_mode |= AE_IFMT & type; +} + +void +archive_entry_set_fflags(struct archive_entry *entry, + unsigned long set, unsigned long clear) +{ + aes_clean(&entry->ae_fflags_text); + entry->ae_fflags_set = set; + entry->ae_fflags_clear = clear; +} + +const char * +archive_entry_copy_fflags_text(struct archive_entry *entry, + const char *flags) +{ + aes_copy_mbs(&entry->ae_fflags_text, flags); + return (ae_strtofflags(flags, + &entry->ae_fflags_set, &entry->ae_fflags_clear)); +} + +const wchar_t * +archive_entry_copy_fflags_text_w(struct archive_entry *entry, + const wchar_t *flags) +{ + aes_copy_wcs(&entry->ae_fflags_text, flags); + return (ae_wcstofflags(flags, + &entry->ae_fflags_set, &entry->ae_fflags_clear)); +} + +void +archive_entry_set_gid(struct archive_entry *entry, gid_t g) +{ + entry->stat_valid = 0; + entry->ae_stat.aest_gid = g; +} + +void +archive_entry_set_gname(struct archive_entry *entry, const char *name) +{ + aes_set_mbs(&entry->ae_gname, name); +} + +void +archive_entry_copy_gname(struct archive_entry *entry, const char *name) +{ + aes_copy_mbs(&entry->ae_gname, name); +} + +void +archive_entry_copy_gname_w(struct archive_entry *entry, const wchar_t *name) +{ + aes_copy_wcs(&entry->ae_gname, name); +} + +int +archive_entry_update_gname_utf8(struct archive_entry *entry, const char *name) +{ + return 
(aes_update_utf8(&entry->ae_gname, name)); +} + +void +archive_entry_set_ino(struct archive_entry *entry, unsigned long ino) +{ + entry->stat_valid = 0; + entry->ae_stat.aest_ino = ino; +} + +#ifndef __minix +void +archive_entry_set_ino64(struct archive_entry *entry, int64_t ino) +{ + entry->stat_valid = 0; + entry->ae_stat.aest_ino = ino; +} +#endif + +void +archive_entry_set_hardlink(struct archive_entry *entry, const char *target) +{ + aes_set_mbs(&entry->ae_hardlink, target); + if (target != NULL) + entry->ae_set |= AE_SET_HARDLINK; + else + entry->ae_set &= ~AE_SET_HARDLINK; +} + +void +archive_entry_copy_hardlink(struct archive_entry *entry, const char *target) +{ + aes_copy_mbs(&entry->ae_hardlink, target); + if (target != NULL) + entry->ae_set |= AE_SET_HARDLINK; + else + entry->ae_set &= ~AE_SET_HARDLINK; +} + +void +archive_entry_copy_hardlink_w(struct archive_entry *entry, const wchar_t *target) +{ + aes_copy_wcs(&entry->ae_hardlink, target); + if (target != NULL) + entry->ae_set |= AE_SET_HARDLINK; + else + entry->ae_set &= ~AE_SET_HARDLINK; +} + +int +archive_entry_update_hardlink_utf8(struct archive_entry *entry, const char *target) +{ + if (target != NULL) + entry->ae_set |= AE_SET_HARDLINK; + else + entry->ae_set &= ~AE_SET_HARDLINK; + return (aes_update_utf8(&entry->ae_hardlink, target)); +} + +void +archive_entry_set_atime(struct archive_entry *entry, time_t t, long ns) +{ + entry->stat_valid = 0; + entry->ae_set |= AE_SET_ATIME; + entry->ae_stat.aest_atime = t; + entry->ae_stat.aest_atime_nsec = ns; +} + +void +archive_entry_unset_atime(struct archive_entry *entry) +{ + archive_entry_set_atime(entry, 0, 0); + entry->ae_set &= ~AE_SET_ATIME; +} + +void +archive_entry_set_birthtime(struct archive_entry *entry, time_t m, long ns) +{ + entry->stat_valid = 0; + entry->ae_set |= AE_SET_BIRTHTIME; + entry->ae_stat.aest_birthtime = m; + entry->ae_stat.aest_birthtime_nsec = ns; +} + +void +archive_entry_unset_birthtime(struct archive_entry *entry) +{ + 
archive_entry_set_birthtime(entry, 0, 0); + entry->ae_set &= ~AE_SET_BIRTHTIME; +} + +void +archive_entry_set_ctime(struct archive_entry *entry, time_t t, long ns) +{ + entry->stat_valid = 0; + entry->ae_set |= AE_SET_CTIME; + entry->ae_stat.aest_ctime = t; + entry->ae_stat.aest_ctime_nsec = ns; +} + +void +archive_entry_unset_ctime(struct archive_entry *entry) +{ + archive_entry_set_ctime(entry, 0, 0); + entry->ae_set &= ~AE_SET_CTIME; +} + +void +archive_entry_set_dev(struct archive_entry *entry, dev_t d) +{ + entry->stat_valid = 0; + entry->ae_stat.aest_dev_is_broken_down = 0; + entry->ae_stat.aest_dev = d; +} + +void +archive_entry_set_devmajor(struct archive_entry *entry, dev_t m) +{ + entry->stat_valid = 0; + entry->ae_stat.aest_dev_is_broken_down = 1; + entry->ae_stat.aest_devmajor = m; +} + +void +archive_entry_set_devminor(struct archive_entry *entry, dev_t m) +{ + entry->stat_valid = 0; + entry->ae_stat.aest_dev_is_broken_down = 1; + entry->ae_stat.aest_devminor = m; +} + +/* Set symlink if symlink is already set, else set hardlink. */ +void +archive_entry_set_link(struct archive_entry *entry, const char *target) +{ + if (entry->ae_set & AE_SET_SYMLINK) + aes_set_mbs(&entry->ae_symlink, target); + else + aes_set_mbs(&entry->ae_hardlink, target); +} + +/* Set symlink if symlink is already set, else set hardlink. */ +void +archive_entry_copy_link(struct archive_entry *entry, const char *target) +{ + if (entry->ae_set & AE_SET_SYMLINK) + aes_copy_mbs(&entry->ae_symlink, target); + else + aes_copy_mbs(&entry->ae_hardlink, target); +} + +/* Set symlink if symlink is already set, else set hardlink. 
*/ +void +archive_entry_copy_link_w(struct archive_entry *entry, const wchar_t *target) +{ + if (entry->ae_set & AE_SET_SYMLINK) + aes_copy_wcs(&entry->ae_symlink, target); + else + aes_copy_wcs(&entry->ae_hardlink, target); +} + +int +archive_entry_update_link_utf8(struct archive_entry *entry, const char *target) +{ + if (entry->ae_set & AE_SET_SYMLINK) + return (aes_update_utf8(&entry->ae_symlink, target)); + else + return (aes_update_utf8(&entry->ae_hardlink, target)); +} + +void +archive_entry_set_mode(struct archive_entry *entry, mode_t m) +{ + entry->stat_valid = 0; + entry->ae_stat.aest_mode = m; +} + +void +archive_entry_set_mtime(struct archive_entry *entry, time_t m, long ns) +{ + entry->stat_valid = 0; + entry->ae_set |= AE_SET_MTIME; + entry->ae_stat.aest_mtime = m; + entry->ae_stat.aest_mtime_nsec = ns; +} + +void +archive_entry_unset_mtime(struct archive_entry *entry) +{ + archive_entry_set_mtime(entry, 0, 0); + entry->ae_set &= ~AE_SET_MTIME; +} + +void +archive_entry_set_nlink(struct archive_entry *entry, unsigned int nlink) +{ + entry->stat_valid = 0; + entry->ae_stat.aest_nlink = nlink; +} + +void +archive_entry_set_pathname(struct archive_entry *entry, const char *name) +{ + aes_set_mbs(&entry->ae_pathname, name); +} + +void +archive_entry_copy_pathname(struct archive_entry *entry, const char *name) +{ + aes_copy_mbs(&entry->ae_pathname, name); +} + +void +archive_entry_copy_pathname_w(struct archive_entry *entry, const wchar_t *name) +{ + aes_copy_wcs(&entry->ae_pathname, name); +} + +int +archive_entry_update_pathname_utf8(struct archive_entry *entry, const char *name) +{ + return (aes_update_utf8(&entry->ae_pathname, name)); +} + +void +archive_entry_set_perm(struct archive_entry *entry, mode_t p) +{ + entry->stat_valid = 0; + entry->ae_stat.aest_mode &= AE_IFMT; + entry->ae_stat.aest_mode |= ~AE_IFMT & p; +} + +void +archive_entry_set_rdev(struct archive_entry *entry, dev_t m) +{ + entry->stat_valid = 0; + entry->ae_stat.aest_rdev = m; + 
entry->ae_stat.aest_rdev_is_broken_down = 0; +} + +void +archive_entry_set_rdevmajor(struct archive_entry *entry, dev_t m) +{ + entry->stat_valid = 0; + entry->ae_stat.aest_rdev_is_broken_down = 1; + entry->ae_stat.aest_rdevmajor = m; +} + +void +archive_entry_set_rdevminor(struct archive_entry *entry, dev_t m) +{ + entry->stat_valid = 0; + entry->ae_stat.aest_rdev_is_broken_down = 1; + entry->ae_stat.aest_rdevminor = m; +} + +#ifndef __minix +void +archive_entry_set_size(struct archive_entry *entry, int64_t s) +{ + entry->stat_valid = 0; + entry->ae_stat.aest_size = s; + entry->ae_set |= AE_SET_SIZE; +} +#else +void +archive_entry_set_size(struct archive_entry *entry, ssize_t s) +{ + entry->stat_valid = 0; + entry->ae_stat.aest_size = s; + entry->ae_set |= AE_SET_SIZE; +} +#endif + +void +archive_entry_unset_size(struct archive_entry *entry) +{ + archive_entry_set_size(entry, 0); + entry->ae_set &= ~AE_SET_SIZE; +} + +void +archive_entry_copy_sourcepath(struct archive_entry *entry, const char *path) +{ + aes_set_mbs(&entry->ae_sourcepath, path); +} + +void +archive_entry_set_symlink(struct archive_entry *entry, const char *linkname) +{ + aes_set_mbs(&entry->ae_symlink, linkname); + if (linkname != NULL) + entry->ae_set |= AE_SET_SYMLINK; + else + entry->ae_set &= ~AE_SET_SYMLINK; +} + +void +archive_entry_copy_symlink(struct archive_entry *entry, const char *linkname) +{ + aes_copy_mbs(&entry->ae_symlink, linkname); + if (linkname != NULL) + entry->ae_set |= AE_SET_SYMLINK; + else + entry->ae_set &= ~AE_SET_SYMLINK; +} + +void +archive_entry_copy_symlink_w(struct archive_entry *entry, const wchar_t *linkname) +{ + aes_copy_wcs(&entry->ae_symlink, linkname); + if (linkname != NULL) + entry->ae_set |= AE_SET_SYMLINK; + else + entry->ae_set &= ~AE_SET_SYMLINK; +} + +int +archive_entry_update_symlink_utf8(struct archive_entry *entry, const char *linkname) +{ + if (linkname != NULL) + entry->ae_set |= AE_SET_SYMLINK; + else + entry->ae_set &= ~AE_SET_SYMLINK; + return 
(aes_update_utf8(&entry->ae_symlink, linkname)); +} + +void +archive_entry_set_uid(struct archive_entry *entry, uid_t u) +{ + entry->stat_valid = 0; + entry->ae_stat.aest_uid = u; +} + +void +archive_entry_set_uname(struct archive_entry *entry, const char *name) +{ + aes_set_mbs(&entry->ae_uname, name); +} + +void +archive_entry_copy_uname(struct archive_entry *entry, const char *name) +{ + aes_copy_mbs(&entry->ae_uname, name); +} + +void +archive_entry_copy_uname_w(struct archive_entry *entry, const wchar_t *name) +{ + aes_copy_wcs(&entry->ae_uname, name); +} + +int +archive_entry_update_uname_utf8(struct archive_entry *entry, const char *name) +{ + return (aes_update_utf8(&entry->ae_uname, name)); +} + +/* + * ACL management. The following would, of course, be a lot simpler + * if: 1) the last draft of POSIX.1e were a really thorough and + * complete standard that addressed the needs of ACL archiving and 2) + * everyone followed it faithfully. Alas, neither is true, so the + * following is a lot more complex than might seem necessary to the + * uninitiated. + */ + +void +archive_entry_acl_clear(struct archive_entry *entry) +{ + struct ae_acl *ap; + + while (entry->acl_head != NULL) { + ap = entry->acl_head->next; + aes_clean(&entry->acl_head->name); + free(entry->acl_head); + entry->acl_head = ap; + } + if (entry->acl_text_w != NULL) { + free(entry->acl_text_w); + entry->acl_text_w = NULL; + } + entry->acl_p = NULL; + entry->acl_state = 0; /* Not counting. */ +} + +/* + * Add a single ACL entry to the internal list of ACL data. 
+ */ +void +archive_entry_acl_add_entry(struct archive_entry *entry, + int type, int permset, int tag, int id, const char *name) +{ + struct ae_acl *ap; + + if (acl_special(entry, type, permset, tag) == 0) + return; + ap = acl_new_entry(entry, type, permset, tag, id); + if (ap == NULL) { + /* XXX Error XXX */ + return; + } + if (name != NULL && *name != '\0') + aes_copy_mbs(&ap->name, name); + else + aes_clean(&ap->name); +} + +/* + * As above, but with a wide-character name. + */ +void +archive_entry_acl_add_entry_w(struct archive_entry *entry, + int type, int permset, int tag, int id, const wchar_t *name) +{ + archive_entry_acl_add_entry_w_len(entry, type, permset, tag, id, name, wcslen(name)); +} + +static void +archive_entry_acl_add_entry_w_len(struct archive_entry *entry, + int type, int permset, int tag, int id, const wchar_t *name, size_t len) +{ + struct ae_acl *ap; + + if (acl_special(entry, type, permset, tag) == 0) + return; + ap = acl_new_entry(entry, type, permset, tag, id); + if (ap == NULL) { + /* XXX Error XXX */ + return; + } + if (name != NULL && *name != L'\0' && len > 0) + aes_copy_wcs_len(&ap->name, name, len); + else + aes_clean(&ap->name); +} + +/* + * If this ACL entry is part of the standard POSIX permissions set, + * store the permissions in the stat structure and return zero. + */ +static int +acl_special(struct archive_entry *entry, int type, int permset, int tag) +{ + if (type == ARCHIVE_ENTRY_ACL_TYPE_ACCESS) { + switch (tag) { + case ARCHIVE_ENTRY_ACL_USER_OBJ: + entry->ae_stat.aest_mode &= ~0700; + entry->ae_stat.aest_mode |= (permset & 7) << 6; + return (0); + case ARCHIVE_ENTRY_ACL_GROUP_OBJ: + entry->ae_stat.aest_mode &= ~0070; + entry->ae_stat.aest_mode |= (permset & 7) << 3; + return (0); + case ARCHIVE_ENTRY_ACL_OTHER: + entry->ae_stat.aest_mode &= ~0007; + entry->ae_stat.aest_mode |= permset & 7; + return (0); + } + } + return (1); +} + +/* + * Allocate and populate a new ACL entry with everything but the + * name. 
+ */ +static struct ae_acl * +acl_new_entry(struct archive_entry *entry, + int type, int permset, int tag, int id) +{ + struct ae_acl *ap, *aq; + + if (type != ARCHIVE_ENTRY_ACL_TYPE_ACCESS && + type != ARCHIVE_ENTRY_ACL_TYPE_DEFAULT) + return (NULL); + if (entry->acl_text_w != NULL) { + free(entry->acl_text_w); + entry->acl_text_w = NULL; + } + + /* XXX TODO: More sanity-checks on the arguments XXX */ + + /* If there's a matching entry already in the list, overwrite it. */ + ap = entry->acl_head; + aq = NULL; + while (ap != NULL) { + if (ap->type == type && ap->tag == tag && ap->id == id) { + ap->permset = permset; + return (ap); + } + aq = ap; + ap = ap->next; + } + + /* Add a new entry to the end of the list. */ + ap = (struct ae_acl *)malloc(sizeof(*ap)); + if (ap == NULL) + return (NULL); + memset(ap, 0, sizeof(*ap)); + if (aq == NULL) + entry->acl_head = ap; + else + aq->next = ap; + ap->type = type; + ap->tag = tag; + ap->id = id; + ap->permset = permset; + return (ap); +} + +/* + * Return a count of entries matching "want_type". + */ +int +archive_entry_acl_count(struct archive_entry *entry, int want_type) +{ + int count; + struct ae_acl *ap; + + count = 0; + ap = entry->acl_head; + while (ap != NULL) { + if ((ap->type & want_type) != 0) + count++; + ap = ap->next; + } + + if (count > 0 && ((want_type & ARCHIVE_ENTRY_ACL_TYPE_ACCESS) != 0)) + count += 3; + return (count); +} + +/* + * Prepare for reading entries from the ACL data. Returns a count + * of entries matching "want_type", or zero if there are no + * non-extended ACL entries of that type. + */ +int +archive_entry_acl_reset(struct archive_entry *entry, int want_type) +{ + int count, cutoff; + + count = archive_entry_acl_count(entry, want_type); + + /* + * If the only entries are the three standard ones, + * then don't return any ACL data. (In this case, + * client can just use chmod(2) to set permissions.) 
+ */ + if ((want_type & ARCHIVE_ENTRY_ACL_TYPE_ACCESS) != 0) + cutoff = 3; + else + cutoff = 0; + + if (count > cutoff) + entry->acl_state = ARCHIVE_ENTRY_ACL_USER_OBJ; + else + entry->acl_state = 0; + entry->acl_p = entry->acl_head; + return (count); +} + +/* + * Return the next ACL entry in the list. Fake entries for the + * standard permissions and include them in the returned list. + */ + +int +archive_entry_acl_next(struct archive_entry *entry, int want_type, int *type, + int *permset, int *tag, int *id, const char **name) +{ + *name = NULL; + *id = -1; + + /* + * The acl_state is either zero (no entries available), -1 + * (reading from list), or an entry type (retrieve that type + * from ae_stat.aest_mode). + */ + if (entry->acl_state == 0) + return (ARCHIVE_WARN); + + /* The first three access entries are special. */ + if ((want_type & ARCHIVE_ENTRY_ACL_TYPE_ACCESS) != 0) { + switch (entry->acl_state) { + case ARCHIVE_ENTRY_ACL_USER_OBJ: + *permset = (entry->ae_stat.aest_mode >> 6) & 7; + *type = ARCHIVE_ENTRY_ACL_TYPE_ACCESS; + *tag = ARCHIVE_ENTRY_ACL_USER_OBJ; + entry->acl_state = ARCHIVE_ENTRY_ACL_GROUP_OBJ; + return (ARCHIVE_OK); + case ARCHIVE_ENTRY_ACL_GROUP_OBJ: + *permset = (entry->ae_stat.aest_mode >> 3) & 7; + *type = ARCHIVE_ENTRY_ACL_TYPE_ACCESS; + *tag = ARCHIVE_ENTRY_ACL_GROUP_OBJ; + entry->acl_state = ARCHIVE_ENTRY_ACL_OTHER; + return (ARCHIVE_OK); + case ARCHIVE_ENTRY_ACL_OTHER: + *permset = entry->ae_stat.aest_mode & 7; + *type = ARCHIVE_ENTRY_ACL_TYPE_ACCESS; + *tag = ARCHIVE_ENTRY_ACL_OTHER; + entry->acl_state = -1; + entry->acl_p = entry->acl_head; + return (ARCHIVE_OK); + default: + break; + } + } + + while (entry->acl_p != NULL && (entry->acl_p->type & want_type) == 0) + entry->acl_p = entry->acl_p->next; + if (entry->acl_p == NULL) { + entry->acl_state = 0; + *type = 0; + *permset = 0; + *tag = 0; + *id = -1; + *name = NULL; + return (ARCHIVE_EOF); /* End of ACL entries. 
*/ + } + *type = entry->acl_p->type; + *permset = entry->acl_p->permset; + *tag = entry->acl_p->tag; + *id = entry->acl_p->id; + *name = aes_get_mbs(&entry->acl_p->name); + entry->acl_p = entry->acl_p->next; + return (ARCHIVE_OK); +} + +/* + * Generate a text version of the ACL. The flags parameter controls + * the style of the generated ACL. + */ +const wchar_t * +archive_entry_acl_text_w(struct archive_entry *entry, int flags) +{ + int count; + size_t length; + const wchar_t *wname; + const wchar_t *prefix; + wchar_t separator; + struct ae_acl *ap; + int id; + wchar_t *wp; + + if (entry->acl_text_w != NULL) { + free (entry->acl_text_w); + entry->acl_text_w = NULL; + } + + separator = L','; + count = 0; + length = 0; + ap = entry->acl_head; + while (ap != NULL) { + if ((ap->type & flags) != 0) { + count++; + if ((flags & ARCHIVE_ENTRY_ACL_STYLE_MARK_DEFAULT) && + (ap->type & ARCHIVE_ENTRY_ACL_TYPE_DEFAULT)) + length += 8; /* "default:" */ + length += 5; /* tag name */ + length += 1; /* colon */ + wname = aes_get_wcs(&ap->name); + if (wname != NULL) + length += wcslen(wname); + else + length += sizeof(uid_t) * 3 + 1; + length ++; /* colon */ + length += 3; /* rwx */ + length += 1; /* colon */ + length += max(sizeof(uid_t), sizeof(gid_t)) * 3 + 1; + length ++; /* newline */ + } + ap = ap->next; + } + + if (count > 0 && ((flags & ARCHIVE_ENTRY_ACL_TYPE_ACCESS) != 0)) { + length += 10; /* "user::rwx\n" */ + length += 11; /* "group::rwx\n" */ + length += 11; /* "other::rwx\n" */ + } + + if (count == 0) + return (NULL); + + /* Now, allocate the string and actually populate it. 
*/ + wp = entry->acl_text_w = (wchar_t *)malloc(length * sizeof(wchar_t)); + if (wp == NULL) + __archive_errx(1, "No memory to generate the text version of the ACL"); + count = 0; + if ((flags & ARCHIVE_ENTRY_ACL_TYPE_ACCESS) != 0) { + append_entry_w(&wp, NULL, ARCHIVE_ENTRY_ACL_USER_OBJ, NULL, + entry->ae_stat.aest_mode & 0700, -1); + *wp++ = ','; + append_entry_w(&wp, NULL, ARCHIVE_ENTRY_ACL_GROUP_OBJ, NULL, + entry->ae_stat.aest_mode & 0070, -1); + *wp++ = ','; + append_entry_w(&wp, NULL, ARCHIVE_ENTRY_ACL_OTHER, NULL, + entry->ae_stat.aest_mode & 0007, -1); + count += 3; + + ap = entry->acl_head; + while (ap != NULL) { + if ((ap->type & ARCHIVE_ENTRY_ACL_TYPE_ACCESS) != 0) { + wname = aes_get_wcs(&ap->name); + *wp++ = separator; + if (flags & ARCHIVE_ENTRY_ACL_STYLE_EXTRA_ID) + id = ap->id; + else + id = -1; + append_entry_w(&wp, NULL, ap->tag, wname, + ap->permset, id); + count++; + } + ap = ap->next; + } + } + + + if ((flags & ARCHIVE_ENTRY_ACL_TYPE_DEFAULT) != 0) { + if (flags & ARCHIVE_ENTRY_ACL_STYLE_MARK_DEFAULT) + prefix = L"default:"; + else + prefix = NULL; + ap = entry->acl_head; + count = 0; + while (ap != NULL) { + if ((ap->type & ARCHIVE_ENTRY_ACL_TYPE_DEFAULT) != 0) { + wname = aes_get_wcs(&ap->name); + if (count > 0) + *wp++ = separator; + if (flags & ARCHIVE_ENTRY_ACL_STYLE_EXTRA_ID) + id = ap->id; + else + id = -1; + append_entry_w(&wp, prefix, ap->tag, + wname, ap->permset, id); + count ++; + } + ap = ap->next; + } + } + + return (entry->acl_text_w); +} + +static void +append_id_w(wchar_t **wp, int id) +{ + if (id < 0) + id = 0; + if (id > 9) + append_id_w(wp, id / 10); + *(*wp)++ = L"0123456789"[id % 10]; +} + +static void +append_entry_w(wchar_t **wp, const wchar_t *prefix, int tag, + const wchar_t *wname, int perm, int id) +{ + if (prefix != NULL) { + wcscpy(*wp, prefix); + *wp += wcslen(*wp); + } + switch (tag) { + case ARCHIVE_ENTRY_ACL_USER_OBJ: + wname = NULL; + id = -1; + /* FALLTHROUGH */ + case ARCHIVE_ENTRY_ACL_USER: + wcscpy(*wp, 
L"user"); + break; + case ARCHIVE_ENTRY_ACL_GROUP_OBJ: + wname = NULL; + id = -1; + /* FALLTHROUGH */ + case ARCHIVE_ENTRY_ACL_GROUP: + wcscpy(*wp, L"group"); + break; + case ARCHIVE_ENTRY_ACL_MASK: + wcscpy(*wp, L"mask"); + wname = NULL; + id = -1; + break; + case ARCHIVE_ENTRY_ACL_OTHER: + wcscpy(*wp, L"other"); + wname = NULL; + id = -1; + break; + } + *wp += wcslen(*wp); + *(*wp)++ = L':'; + if (wname != NULL) { + wcscpy(*wp, wname); + *wp += wcslen(*wp); + } else if (tag == ARCHIVE_ENTRY_ACL_USER + || tag == ARCHIVE_ENTRY_ACL_GROUP) { + append_id_w(wp, id); + id = -1; + } + *(*wp)++ = L':'; + *(*wp)++ = (perm & 0444) ? L'r' : L'-'; + *(*wp)++ = (perm & 0222) ? L'w' : L'-'; + *(*wp)++ = (perm & 0111) ? L'x' : L'-'; + if (id != -1) { + *(*wp)++ = L':'; + append_id_w(wp, id); + } + **wp = L'\0'; +} + +/* + * Parse a textual ACL. This automatically recognizes and supports + * extensions described above. The 'type' argument is used to + * indicate the type that should be used for any entries not + * explicitly marked as "default:". + */ +int +__archive_entry_acl_parse_w(struct archive_entry *entry, + const wchar_t *text, int default_type) +{ + struct { + const wchar_t *start; + const wchar_t *end; + } field[4], name; + + int fields, n; + int type, tag, permset, id; + wchar_t sep; + + while (text != NULL && *text != L'\0') { + /* + * Parse the fields out of the next entry, + * advance 'text' to start of next entry. + */ + fields = 0; + do { + const wchar_t *start, *end; + next_field_w(&text, &start, &end, &sep); + if (fields < 4) { + field[fields].start = start; + field[fields].end = end; + } + ++fields; + } while (sep == L':'); + + /* Set remaining fields to blank. */ + for (n = fields; n < 4; ++n) + field[n].start = field[n].end = NULL; + + /* Check for a numeric ID in field 1 or 3. */ + id = -1; + isint_w(field[1].start, field[1].end, &id); + /* Field 3 is optional. 
*/ + if (id == -1 && fields > 3) + isint_w(field[3].start, field[3].end, &id); + + /* + * Solaris extension: "defaultuser::rwx" is the + * default ACL corresponding to "user::rwx", etc. + */ + if (field[0].end - field[0].start > 7 + && wmemcmp(field[0].start, L"default", 7) == 0) { + type = ARCHIVE_ENTRY_ACL_TYPE_DEFAULT; + field[0].start += 7; + } else + type = default_type; + + name.start = name.end = NULL; + if (prefix_w(field[0].start, field[0].end, L"user")) { + if (!ismode_w(field[2].start, field[2].end, &permset)) + return (ARCHIVE_WARN); + if (id != -1 || field[1].start < field[1].end) { + tag = ARCHIVE_ENTRY_ACL_USER; + name = field[1]; + } else + tag = ARCHIVE_ENTRY_ACL_USER_OBJ; + } else if (prefix_w(field[0].start, field[0].end, L"group")) { + if (!ismode_w(field[2].start, field[2].end, &permset)) + return (ARCHIVE_WARN); + if (id != -1 || field[1].start < field[1].end) { + tag = ARCHIVE_ENTRY_ACL_GROUP; + name = field[1]; + } else + tag = ARCHIVE_ENTRY_ACL_GROUP_OBJ; + } else if (prefix_w(field[0].start, field[0].end, L"other")) { + if (fields == 2 + && field[1].start < field[1].end + && ismode_w(field[1].start, field[1].end, &permset)) { + /* This is Solaris-style "other:rwx" */ + } else if (fields == 3 + && field[1].start == field[1].end + && field[2].start < field[2].end + && ismode_w(field[2].start, field[2].end, &permset)) { + /* This is FreeBSD-style "other::rwx" */ + } else + return (ARCHIVE_WARN); + tag = ARCHIVE_ENTRY_ACL_OTHER; + } else if (prefix_w(field[0].start, field[0].end, L"mask")) { + if (fields == 2 + && field[1].start < field[1].end + && ismode_w(field[1].start, field[1].end, &permset)) { + /* This is Solaris-style "mask:rwx" */ + } else if (fields == 3 + && field[1].start == field[1].end + && field[2].start < field[2].end + && ismode_w(field[2].start, field[2].end, &permset)) { + /* This is FreeBSD-style "mask::rwx" */ + } else + return (ARCHIVE_WARN); + tag = ARCHIVE_ENTRY_ACL_MASK; + } else + return (ARCHIVE_WARN); + + /* Add 
entry to the internal list. */ + archive_entry_acl_add_entry_w_len(entry, type, permset, + tag, id, name.start, name.end - name.start); + } + return (ARCHIVE_OK); +} + +/* + * Parse a string to a positive decimal integer. Returns true if + * the string is non-empty and consists only of decimal digits, + * false otherwise. + */ +static int +isint_w(const wchar_t *start, const wchar_t *end, int *result) +{ + int n = 0; + if (start >= end) + return (0); + while (start < end) { + if (*start < '0' || *start > '9') + return (0); + if (n > (INT_MAX / 10)) + n = INT_MAX; + else { + n *= 10; + n += *start - '0'; + } + start++; + } + *result = n; + return (1); +} + +/* + * Parse a string as a mode field. Returns true if + * the string is non-empty and consists only of mode characters, + * false otherwise. + */ +static int +ismode_w(const wchar_t *start, const wchar_t *end, int *permset) +{ + const wchar_t *p; + + if (start >= end) + return (0); + p = start; + *permset = 0; + while (p < end) { + switch (*p++) { + case 'r': case 'R': + *permset |= ARCHIVE_ENTRY_ACL_READ; + break; + case 'w': case 'W': + *permset |= ARCHIVE_ENTRY_ACL_WRITE; + break; + case 'x': case 'X': + *permset |= ARCHIVE_ENTRY_ACL_EXECUTE; + break; + case '-': + break; + default: + return (0); + } + } + return (1); +} + +/* + * Match "[:whitespace:]*(.*)[:whitespace:]*[:,\n]". *wp is updated + * to point to just after the separator. *start points to the first + * character of the matched text and *end just after the last + * character of the matched identifier. In particular *end - *start + * is the length of the field body, not including leading or trailing + * whitespace. + */ +static void +next_field_w(const wchar_t **wp, const wchar_t **start, + const wchar_t **end, wchar_t *sep) +{ + /* Skip leading whitespace to find start of field. */ + while (**wp == L' ' || **wp == L'\t' || **wp == L'\n') { + (*wp)++; + } + *start = *wp; + + /* Scan for the separator. 
*/ + while (**wp != L'\0' && **wp != L',' && **wp != L':' && + **wp != L'\n') { + (*wp)++; + } + *sep = **wp; + + /* Trim trailing whitespace to locate end of field. */ + *end = *wp - 1; + while (**end == L' ' || **end == L'\t' || **end == L'\n') { + (*end)--; + } + (*end)++; + + /* Adjust scanner location. */ + if (**wp != L'\0') + (*wp)++; +} + +/* + * Return true if the characters [start...end) are a prefix of 'test'. + * This makes it easy to handle the obvious abbreviations: 'u' for 'user', etc. + */ +static int +prefix_w(const wchar_t *start, const wchar_t *end, const wchar_t *test) +{ + if (start == end) + return (0); + + if (*start++ != *test++) + return (0); + + while (start < end && *start++ == *test++) + ; + + if (start < end) + return (0); + + return (1); +} + + +/* + * Following code is modified from UC Berkeley sources, and + * is subject to the following copyright notice. + */ + +/*- + * Copyright (c) 1993 + * The Regents of the University of California. All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * 4. Neither the name of the University nor the names of its contributors + * may be used to endorse or promote products derived from this software + * without specific prior written permission. + * + * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. 
IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. + */ + +static struct flag { + const char *name; + const wchar_t *wname; + unsigned long set; + unsigned long clear; +} flags[] = { + /* Preferred (shorter) names per flag first, all prefixed by "no" */ +#ifdef SF_APPEND + { "nosappnd", L"nosappnd", SF_APPEND, 0 }, + { "nosappend", L"nosappend", SF_APPEND, 0 }, +#endif +#ifdef EXT2_APPEND_FL /* 'a' */ + { "nosappnd", L"nosappnd", EXT2_APPEND_FL, 0 }, + { "nosappend", L"nosappend", EXT2_APPEND_FL, 0 }, +#endif +#ifdef SF_ARCHIVED + { "noarch", L"noarch", SF_ARCHIVED, 0 }, + { "noarchived", L"noarchived", SF_ARCHIVED, 0 }, +#endif +#ifdef SF_IMMUTABLE + { "noschg", L"noschg", SF_IMMUTABLE, 0 }, + { "noschange", L"noschange", SF_IMMUTABLE, 0 }, + { "nosimmutable", L"nosimmutable", SF_IMMUTABLE, 0 }, +#endif +#ifdef EXT2_IMMUTABLE_FL /* 'i' */ + { "noschg", L"noschg", EXT2_IMMUTABLE_FL, 0 }, + { "noschange", L"noschange", EXT2_IMMUTABLE_FL, 0 }, + { "nosimmutable", L"nosimmutable", EXT2_IMMUTABLE_FL, 0 }, +#endif +#ifdef SF_NOUNLINK + { "nosunlnk", L"nosunlnk", SF_NOUNLINK, 0 }, + { "nosunlink", L"nosunlink", SF_NOUNLINK, 0 }, +#endif +#ifdef SF_SNAPSHOT + { "nosnapshot", L"nosnapshot", SF_SNAPSHOT, 0 }, +#endif +#ifdef UF_APPEND + { "nouappnd", L"nouappnd", UF_APPEND, 0 }, + { "nouappend", L"nouappend", UF_APPEND, 0 }, +#endif +#ifdef UF_IMMUTABLE + { "nouchg", L"nouchg", UF_IMMUTABLE, 0 }, + { "nouchange", L"nouchange", UF_IMMUTABLE, 0 }, + { "nouimmutable", L"nouimmutable", UF_IMMUTABLE, 0 }, 
+#endif +#ifdef UF_NODUMP + { "nodump", L"nodump", 0, UF_NODUMP}, +#endif +#ifdef EXT2_NODUMP_FL /* 'd' */ + { "nodump", L"nodump", 0, EXT2_NODUMP_FL}, +#endif +#ifdef UF_OPAQUE + { "noopaque", L"noopaque", UF_OPAQUE, 0 }, +#endif +#ifdef UF_NOUNLINK + { "nouunlnk", L"nouunlnk", UF_NOUNLINK, 0 }, + { "nouunlink", L"nouunlink", UF_NOUNLINK, 0 }, +#endif +#ifdef EXT2_UNRM_FL + { "nouunlink", L"nouunlink", EXT2_UNRM_FL, 0}, +#endif + +#ifdef EXT2_BTREE_FL + { "nobtree", L"nobtree", EXT2_BTREE_FL, 0 }, +#endif + +#ifdef EXT2_ECOMPR_FL + { "nocomperr", L"nocomperr", EXT2_ECOMPR_FL, 0 }, +#endif + +#ifdef EXT2_COMPR_FL /* 'c' */ + { "nocompress", L"nocompress", EXT2_COMPR_FL, 0 }, +#endif + +#ifdef EXT2_NOATIME_FL /* 'A' */ + { "noatime", L"noatime", 0, EXT2_NOATIME_FL}, +#endif + +#ifdef EXT2_DIRTY_FL + { "nocompdirty",L"nocompdirty", EXT2_DIRTY_FL, 0}, +#endif + +#ifdef EXT2_COMPRBLK_FL +#ifdef EXT2_NOCOMPR_FL + { "nocomprblk", L"nocomprblk", EXT2_COMPRBLK_FL, EXT2_NOCOMPR_FL}, +#else + { "nocomprblk", L"nocomprblk", EXT2_COMPRBLK_FL, 0}, +#endif +#endif +#ifdef EXT2_DIRSYNC_FL + { "nodirsync", L"nodirsync", EXT2_DIRSYNC_FL, 0}, +#endif +#ifdef EXT2_INDEX_FL + { "nohashidx", L"nohashidx", EXT2_INDEX_FL, 0}, +#endif +#ifdef EXT2_IMAGIC_FL + { "noimagic", L"noimagic", EXT2_IMAGIC_FL, 0}, +#endif +#ifdef EXT3_JOURNAL_DATA_FL + { "nojournal", L"nojournal", EXT3_JOURNAL_DATA_FL, 0}, +#endif +#ifdef EXT2_SECRM_FL + { "nosecuredeletion",L"nosecuredeletion",EXT2_SECRM_FL, 0}, +#endif +#ifdef EXT2_SYNC_FL + { "nosync", L"nosync", EXT2_SYNC_FL, 0}, +#endif +#ifdef EXT2_NOTAIL_FL + { "notail", L"notail", 0, EXT2_NOTAIL_FL}, +#endif +#ifdef EXT2_TOPDIR_FL + { "notopdir", L"notopdir", EXT2_TOPDIR_FL, 0}, +#endif +#ifdef EXT2_RESERVED_FL + { "noreserved", L"noreserved", EXT2_RESERVED_FL, 0}, +#endif + + { NULL, NULL, 0, 0 } +}; + +/* + * fflagstostr -- + * Convert file flags to a comma-separated string. If no flags + * are set, return the empty string. 
+ */ +static char * +ae_fflagstostr(unsigned long bitset, unsigned long bitclear) +{ + char *string, *dp; + const char *sp; + unsigned long bits; + struct flag *flag; + size_t length; + + bits = bitset | bitclear; + length = 0; + for (flag = flags; flag->name != NULL; flag++) + if (bits & (flag->set | flag->clear)) { + length += strlen(flag->name) + 1; + bits &= ~(flag->set | flag->clear); + } + + if (length == 0) + return (NULL); + string = (char *)malloc(length); + if (string == NULL) + return (NULL); + + dp = string; + for (flag = flags; flag->name != NULL; flag++) { + if (bitset & flag->set || bitclear & flag->clear) { + sp = flag->name + 2; + } else if (bitset & flag->clear || bitclear & flag->set) { + sp = flag->name; + } else + continue; + bitset &= ~(flag->set | flag->clear); + bitclear &= ~(flag->set | flag->clear); + if (dp > string) + *dp++ = ','; + while ((*dp++ = *sp++) != '\0') + ; + dp--; + } + + *dp = '\0'; + return (string); +} + +/* + * strtofflags -- + * Take string of arguments and return file flags. This + * version works a little differently than strtofflags(3). + * In particular, it always tests every token, skipping any + * unrecognized tokens. It returns a pointer to the first + * unrecognized token, or NULL if every token was recognized. + * This version is also const-correct and does not modify the + * provided string. + */ +static const char * +ae_strtofflags(const char *s, unsigned long *setp, unsigned long *clrp) +{ + const char *start, *end; + struct flag *flag; + unsigned long set, clear; + const char *failed; + + set = clear = 0; + start = s; + failed = NULL; + /* Find start of first token. */ + while (*start == '\t' || *start == ' ' || *start == ',') + start++; + while (*start != '\0') { + /* Locate end of token. 
*/ + end = start; + while (*end != '\0' && *end != '\t' && + *end != ' ' && *end != ',') + end++; + for (flag = flags; flag->name != NULL; flag++) { + if (memcmp(start, flag->name, end - start) == 0) { + /* Matched "noXXXX", so reverse the sense. */ + clear |= flag->set; + set |= flag->clear; + break; + } else if (memcmp(start, flag->name + 2, end - start) + == 0) { + /* Matched "XXXX", so don't reverse. */ + set |= flag->set; + clear |= flag->clear; + break; + } + } + /* Ignore unknown flag names. */ + if (flag->name == NULL && failed == NULL) + failed = start; + + /* Find start of next token. */ + start = end; + while (*start == '\t' || *start == ' ' || *start == ',') + start++; + + } + + if (setp) + *setp = set; + if (clrp) + *clrp = clear; + + /* Return location of first failure. */ + return (failed); +} + +/* + * wcstofflags -- + * Take string of arguments and return file flags. This + * version works a little differently than strtofflags(3). + * In particular, it always tests every token, skipping any + * unrecognized tokens. It returns a pointer to the first + * unrecognized token, or NULL if every token was recognized. + * This version is also const-correct and does not modify the + * provided string. + */ +static const wchar_t * +ae_wcstofflags(const wchar_t *s, unsigned long *setp, unsigned long *clrp) +{ + const wchar_t *start, *end; + struct flag *flag; + unsigned long set, clear; + const wchar_t *failed; + + set = clear = 0; + start = s; + failed = NULL; + /* Find start of first token. */ + while (*start == L'\t' || *start == L' ' || *start == L',') + start++; + while (*start != L'\0') { + /* Locate end of token. */ + end = start; + while (*end != L'\0' && *end != L'\t' && + *end != L' ' && *end != L',') + end++; + for (flag = flags; flag->wname != NULL; flag++) { + if (wmemcmp(start, flag->wname, end - start) == 0) { + /* Matched "noXXXX", so reverse the sense. 
*/ + clear |= flag->set; + set |= flag->clear; + break; + } else if (wmemcmp(start, flag->wname + 2, end - start) + == 0) { + /* Matched "XXXX", so don't reverse. */ + set |= flag->set; + clear |= flag->clear; + break; + } + } + /* Ignore unknown flag names. */ + if (flag->wname == NULL && failed == NULL) + failed = start; + + /* Find start of next token. */ + start = end; + while (*start == L'\t' || *start == L' ' || *start == L',') + start++; + + } + + if (setp) + *setp = set; + if (clrp) + *clrp = clear; + + /* Return location of first failure. */ + return (failed); +} + + +#ifdef TEST +#include +int +main(int argc, char **argv) +{ + struct archive_entry *entry = archive_entry_new(); + unsigned long set, clear; + const wchar_t *remainder; + + remainder = archive_entry_copy_fflags_text_w(entry, L"nosappnd dump archive,,,,,,,"); + archive_entry_fflags(entry, &set, &clear); + + wprintf(L"set=0x%lX clear=0x%lX remainder='%ls'\n", set, clear, remainder); + + wprintf(L"new flags='%s'\n", archive_entry_fflags_text(entry)); + return (0); +} +#endif diff --git a/lib/libarchive/archive_entry.h b/lib/libarchive/archive_entry.h new file mode 100644 index 000000000..12b224416 --- /dev/null +++ b/lib/libarchive/archive_entry.h @@ -0,0 +1,541 @@ +/*- + * Copyright (c) 2003-2008 Tim Kientzle + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. 
+ * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR(S) ``AS IS'' AND ANY EXPRESS OR + * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES + * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. + * IN NO EVENT SHALL THE AUTHOR(S) BE LIABLE FOR ANY DIRECT, INDIRECT, + * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT + * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF + * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + * + * $FreeBSD: head/lib/libarchive/archive_entry.h 201096 2009-12-28 02:41:27Z kientzle $ + */ + +#ifndef ARCHIVE_ENTRY_H_INCLUDED +#define ARCHIVE_ENTRY_H_INCLUDED + +/* + * Note: archive_entry.h is for use outside of libarchive; the + * configuration headers (config.h, archive_platform.h, etc.) are + * purely internal. Do NOT use HAVE_XXX configuration macros to + * control the behavior of this header! If you must conditionalize, + * use predefined compiler and/or platform macros. + */ + +#include +#include /* for wchar_t */ +#include + +#if defined(_WIN32) && !defined(__CYGWIN__) +#include +#endif + +/* Get appropriate definitions of standard POSIX-style types. 
*/ +/* These should match the types used in 'struct stat' */ +#if defined(_WIN32) && !defined(__CYGWIN__) +#define __LA_INT64_T __int64 +# if defined(__BORLANDC__) +# define __LA_UID_T uid_t +# define __LA_GID_T gid_t +# define __LA_DEV_T dev_t +# define __LA_MODE_T mode_t +# else +# define __LA_UID_T short +# define __LA_GID_T short +# define __LA_DEV_T unsigned int +# define __LA_MODE_T unsigned short +# endif +#elif defined(__minix) +#define __LA_UID_T uid_t +#define __LA_GID_T gid_t +#define __LA_DEV_T dev_t +#define __LA_MODE_T mode_t +#else +#include +#define __LA_INT64_T int64_t +#define __LA_UID_T uid_t +#define __LA_GID_T gid_t +#define __LA_DEV_T dev_t +#define __LA_MODE_T mode_t +#endif + +/* + * XXX Is this defined for all Windows compilers? If so, in what + * header? It would be nice to remove the __LA_INO_T indirection and + * just use plain ino_t everywhere. Likewise for the other types just + * above. + */ +#define __LA_INO_T ino_t + + +/* + * On Windows, define LIBARCHIVE_STATIC if you're building or using a + * .lib. The default here assumes you're building a DLL. Only + * libarchive source should ever define __LIBARCHIVE_BUILD. + */ +#if ((defined __WIN32__) || (defined _WIN32) || defined(__CYGWIN__)) && (!defined LIBARCHIVE_STATIC) +# ifdef __LIBARCHIVE_BUILD +# ifdef __GNUC__ +# define __LA_DECL __attribute__((dllexport)) extern +# else +# define __LA_DECL __declspec(dllexport) +# endif +# else +# ifdef __GNUC__ +# define __LA_DECL __attribute__((dllimport)) extern +# else +# define __LA_DECL __declspec(dllimport) +# endif +# endif +#else +/* Static libraries on all platforms and shared libraries on non-Windows. */ +# define __LA_DECL +#endif + +#ifdef __cplusplus +extern "C" { +#endif + +/* + * Description of an archive entry. + * + * You can think of this as "struct stat" with some text fields added in. + * + * TODO: Add "comment", "charset", and possibly other entries that are + * supported by "pax interchange" format. 
However, GNU, ustar, cpio, + * and other variants don't support these features, so they're not an + * excruciatingly high priority right now. + * + * TODO: "pax interchange" format allows essentially arbitrary + * key/value attributes to be attached to any entry. Supporting + * such extensions may make this library useful for special + * applications (e.g., a package manager could attach special + * package-management attributes to each entry). + */ +struct archive_entry; + +/* + * File-type constants. These are returned from archive_entry_filetype() + * and passed to archive_entry_set_filetype(). + * + * These values match S_XXX defines on every platform I've checked, + * including Windows, AIX, Linux, Solaris, and BSD. They're + * (re)defined here because platforms generally don't define the ones + * they don't support. For example, Windows doesn't define S_IFLNK or + * S_IFBLK. Instead of having a mass of conditional logic and system + * checks to define any S_XXX values that aren't supported locally, + * I've just defined a new set of such constants so that + * libarchive-based applications can manipulate and identify archive + * entries properly even if the hosting platform can't store them on + * disk. + * + * These values are also used directly within some portable formats, + * such as cpio. If you find a platform that varies from these, the + * correct solution is to leave these alone and translate from these + * portable values to platform-native values when entries are read from + * or written to disk. + */ +#define AE_IFMT 0170000 +#define AE_IFREG 0100000 +#define AE_IFLNK 0120000 +#define AE_IFSOCK 0140000 +#define AE_IFCHR 0020000 +#define AE_IFBLK 0060000 +#define AE_IFDIR 0040000 +#define AE_IFIFO 0010000 + +/* + * Basic object manipulation + */ + +__LA_DECL struct archive_entry *archive_entry_clear(struct archive_entry *); +/* The 'clone' function does a deep copy; all of the strings are copied too. 
*/ +__LA_DECL struct archive_entry *archive_entry_clone(struct archive_entry *); +__LA_DECL void archive_entry_free(struct archive_entry *); +__LA_DECL struct archive_entry *archive_entry_new(void); + +/* + * Retrieve fields from an archive_entry. + * + * There are a number of implicit conversions among these fields. For + * example, if a regular string field is set and you read the _w wide + * character field, the entry will implicitly convert narrow-to-wide + * using the current locale. Similarly, dev values are automatically + * updated when you write devmajor or devminor and vice versa. + * + * In addition, fields can be "set" or "unset." Unset string fields + * return NULL, non-string fields have _is_set() functions to test + * whether they've been set. You can "unset" a string field by + * assigning NULL; non-string fields have _unset() functions to + * unset them. + * + * Note: There is one ambiguity in the above; string fields will + * also return NULL when implicit character set conversions fail. + * This is usually what you want. 
+ */ +__LA_DECL time_t archive_entry_atime(struct archive_entry *); +__LA_DECL long archive_entry_atime_nsec(struct archive_entry *); +__LA_DECL int archive_entry_atime_is_set(struct archive_entry *); +__LA_DECL time_t archive_entry_birthtime(struct archive_entry *); +__LA_DECL long archive_entry_birthtime_nsec(struct archive_entry *); +__LA_DECL int archive_entry_birthtime_is_set(struct archive_entry *); +__LA_DECL time_t archive_entry_ctime(struct archive_entry *); +__LA_DECL long archive_entry_ctime_nsec(struct archive_entry *); +__LA_DECL int archive_entry_ctime_is_set(struct archive_entry *); +__LA_DECL dev_t archive_entry_dev(struct archive_entry *); +__LA_DECL dev_t archive_entry_devmajor(struct archive_entry *); +__LA_DECL dev_t archive_entry_devminor(struct archive_entry *); +__LA_DECL __LA_MODE_T archive_entry_filetype(struct archive_entry *); +__LA_DECL void archive_entry_fflags(struct archive_entry *, + unsigned long * /* set */, + unsigned long * /* clear */); +__LA_DECL const char *archive_entry_fflags_text(struct archive_entry *); +__LA_DECL __LA_GID_T archive_entry_gid(struct archive_entry *); +__LA_DECL const char *archive_entry_gname(struct archive_entry *); +__LA_DECL const wchar_t *archive_entry_gname_w(struct archive_entry *); +__LA_DECL const char *archive_entry_hardlink(struct archive_entry *); +__LA_DECL const wchar_t *archive_entry_hardlink_w(struct archive_entry *); +__LA_DECL __LA_INO_T archive_entry_ino(struct archive_entry *); +#ifndef __minix +__LA_DECL __LA_INT64_T archive_entry_ino64(struct archive_entry *); +#endif +__LA_DECL __LA_MODE_T archive_entry_mode(struct archive_entry *); +__LA_DECL time_t archive_entry_mtime(struct archive_entry *); +__LA_DECL long archive_entry_mtime_nsec(struct archive_entry *); +__LA_DECL int archive_entry_mtime_is_set(struct archive_entry *); +__LA_DECL unsigned int archive_entry_nlink(struct archive_entry *); +__LA_DECL const char *archive_entry_pathname(struct archive_entry *); +__LA_DECL const 
wchar_t *archive_entry_pathname_w(struct archive_entry *); +__LA_DECL dev_t archive_entry_rdev(struct archive_entry *); +__LA_DECL dev_t archive_entry_rdevmajor(struct archive_entry *); +__LA_DECL dev_t archive_entry_rdevminor(struct archive_entry *); +__LA_DECL const char *archive_entry_sourcepath(struct archive_entry *); +#ifndef __minix +__LA_DECL __LA_INT64_T archive_entry_size(struct archive_entry *); +#else +__LA_DECL ssize_t archive_entry_size(struct archive_entry *); +#endif +__LA_DECL int archive_entry_size_is_set(struct archive_entry *); +__LA_DECL const char *archive_entry_strmode(struct archive_entry *); +__LA_DECL const char *archive_entry_symlink(struct archive_entry *); +__LA_DECL const wchar_t *archive_entry_symlink_w(struct archive_entry *); +__LA_DECL __LA_UID_T archive_entry_uid(struct archive_entry *); +__LA_DECL const char *archive_entry_uname(struct archive_entry *); +__LA_DECL const wchar_t *archive_entry_uname_w(struct archive_entry *); + +/* + * Set fields in an archive_entry. + * + * Note that string 'set' functions do not copy the string, only the pointer. + * In contrast, 'copy' functions do copy the object pointed to. + * + * Note: As of libarchive 2.4, 'set' functions do copy the string and + * are therefore exact synonyms for the 'copy' versions. The 'copy' + * names will be retired in libarchive 3.0. 
+ */ + +__LA_DECL void archive_entry_set_atime(struct archive_entry *, time_t, long); +__LA_DECL void archive_entry_unset_atime(struct archive_entry *); +#if defined(_WIN32) && !defined(__CYGWIN__) +__LA_DECL void archive_entry_copy_bhfi(struct archive_entry *, + BY_HANDLE_FILE_INFORMATION *); +#endif +__LA_DECL void archive_entry_set_birthtime(struct archive_entry *, time_t, long); +__LA_DECL void archive_entry_unset_birthtime(struct archive_entry *); +__LA_DECL void archive_entry_set_ctime(struct archive_entry *, time_t, long); +__LA_DECL void archive_entry_unset_ctime(struct archive_entry *); +__LA_DECL void archive_entry_set_dev(struct archive_entry *, dev_t); +__LA_DECL void archive_entry_set_devmajor(struct archive_entry *, dev_t); +__LA_DECL void archive_entry_set_devminor(struct archive_entry *, dev_t); +__LA_DECL void archive_entry_set_filetype(struct archive_entry *, unsigned int); +__LA_DECL void archive_entry_set_fflags(struct archive_entry *, + unsigned long /* set */, unsigned long /* clear */); +/* Returns pointer to start of first invalid token, or NULL if none. */ +/* Note that all recognized tokens are processed, regardless. 
*/ +__LA_DECL const char *archive_entry_copy_fflags_text(struct archive_entry *, + const char *); +__LA_DECL const wchar_t *archive_entry_copy_fflags_text_w(struct archive_entry *, + const wchar_t *); +__LA_DECL void archive_entry_set_gid(struct archive_entry *, __LA_GID_T); +__LA_DECL void archive_entry_set_gname(struct archive_entry *, const char *); +__LA_DECL void archive_entry_copy_gname(struct archive_entry *, const char *); +__LA_DECL void archive_entry_copy_gname_w(struct archive_entry *, const wchar_t *); +__LA_DECL int archive_entry_update_gname_utf8(struct archive_entry *, const char *); +__LA_DECL void archive_entry_set_hardlink(struct archive_entry *, const char *); +__LA_DECL void archive_entry_copy_hardlink(struct archive_entry *, const char *); +__LA_DECL void archive_entry_copy_hardlink_w(struct archive_entry *, const wchar_t *); +__LA_DECL int archive_entry_update_hardlink_utf8(struct archive_entry *, const char *); +#if ARCHIVE_VERSION_NUMBER >= 3000000 +/* Starting with libarchive 3.0, this will be synonym for ino64. 
*/ +__LA_DECL void archive_entry_set_ino(struct archive_entry *, __LA_INT64_T); +#else +__LA_DECL void archive_entry_set_ino(struct archive_entry *, unsigned long); +#endif +#ifndef __minix +__LA_DECL void archive_entry_set_ino64(struct archive_entry *, __LA_INT64_T); +#endif +__LA_DECL void archive_entry_set_link(struct archive_entry *, const char *); +__LA_DECL void archive_entry_copy_link(struct archive_entry *, const char *); +__LA_DECL void archive_entry_copy_link_w(struct archive_entry *, const wchar_t *); +__LA_DECL int archive_entry_update_link_utf8(struct archive_entry *, const char *); +__LA_DECL void archive_entry_set_mode(struct archive_entry *, __LA_MODE_T); +__LA_DECL void archive_entry_set_mtime(struct archive_entry *, time_t, long); +__LA_DECL void archive_entry_unset_mtime(struct archive_entry *); +__LA_DECL void archive_entry_set_nlink(struct archive_entry *, unsigned int); +__LA_DECL void archive_entry_set_pathname(struct archive_entry *, const char *); +__LA_DECL void archive_entry_copy_pathname(struct archive_entry *, const char *); +__LA_DECL void archive_entry_copy_pathname_w(struct archive_entry *, const wchar_t *); +__LA_DECL int archive_entry_update_pathname_utf8(struct archive_entry *, const char *); +__LA_DECL void archive_entry_set_perm(struct archive_entry *, __LA_MODE_T); +__LA_DECL void archive_entry_set_rdev(struct archive_entry *, dev_t); +__LA_DECL void archive_entry_set_rdevmajor(struct archive_entry *, dev_t); +__LA_DECL void archive_entry_set_rdevminor(struct archive_entry *, dev_t); +#ifndef __minix +__LA_DECL void archive_entry_set_size(struct archive_entry *, __LA_INT64_T); +#else +__LA_DECL void archive_entry_set_size(struct archive_entry *, ssize_t); +#endif +__LA_DECL void archive_entry_unset_size(struct archive_entry *); +__LA_DECL void archive_entry_copy_sourcepath(struct archive_entry *, const char *); +__LA_DECL void archive_entry_set_symlink(struct archive_entry *, const char *); +__LA_DECL void 
archive_entry_copy_symlink(struct archive_entry *, const char *); +__LA_DECL void archive_entry_copy_symlink_w(struct archive_entry *, const wchar_t *); +__LA_DECL int archive_entry_update_symlink_utf8(struct archive_entry *, const char *); +__LA_DECL void archive_entry_set_uid(struct archive_entry *, __LA_UID_T); +__LA_DECL void archive_entry_set_uname(struct archive_entry *, const char *); +__LA_DECL void archive_entry_copy_uname(struct archive_entry *, const char *); +__LA_DECL void archive_entry_copy_uname_w(struct archive_entry *, const wchar_t *); +__LA_DECL int archive_entry_update_uname_utf8(struct archive_entry *, const char *); +/* + * Routines to bulk copy fields to/from a platform-native "struct + * stat." Libarchive used to just store a struct stat inside of each + * archive_entry object, but this created issues when trying to + * manipulate archives on systems different than the ones they were + * created on. + * + * TODO: On Linux, provide both stat32 and stat64 versions of these functions. + */ +__LA_DECL const struct stat *archive_entry_stat(struct archive_entry *); +__LA_DECL void archive_entry_copy_stat(struct archive_entry *, const struct stat *); + + +/* + * ACL routines. This used to simply store and return text-format ACL + * strings, but that proved insufficient for a number of reasons: + * = clients need control over uname/uid and gname/gid mappings + * = there are many different ACL text formats + * = would like to be able to read/convert archives containing ACLs + * on platforms that lack ACL libraries + * + * This last point, in particular, forces me to implement a reasonably + * complete set of ACL support routines. + * + * TODO: Extend this to support NFSv4/NTFS permissions. That should + * allow full ACL support on Mac OS, in particular, which uses + * POSIX.1e-style interfaces to manipulate NFSv4/NTFS permissions. + */ + +/* + * Permission bits mimic POSIX.1e. 
Note that I've not followed POSIX.1e's + * "permset"/"perm" abstract type nonsense. A permset is just a simple + * bitmap, following long-standing Unix tradition. + */ +#define ARCHIVE_ENTRY_ACL_EXECUTE 1 +#define ARCHIVE_ENTRY_ACL_WRITE 2 +#define ARCHIVE_ENTRY_ACL_READ 4 + +/* We need to be able to specify either or both of these. */ +#define ARCHIVE_ENTRY_ACL_TYPE_ACCESS 256 +#define ARCHIVE_ENTRY_ACL_TYPE_DEFAULT 512 + +/* Tag values mimic POSIX.1e */ +#define ARCHIVE_ENTRY_ACL_USER 10001 /* Specified user. */ +#define ARCHIVE_ENTRY_ACL_USER_OBJ 10002 /* User who owns the file. */ +#define ARCHIVE_ENTRY_ACL_GROUP 10003 /* Specified group. */ +#define ARCHIVE_ENTRY_ACL_GROUP_OBJ 10004 /* Group who owns the file. */ +#define ARCHIVE_ENTRY_ACL_MASK 10005 /* Modify group access. */ +#define ARCHIVE_ENTRY_ACL_OTHER 10006 /* Public. */ + +/* + * Set the ACL by clearing it and adding entries one at a time. + * Unlike the POSIX.1e ACL routines, you must specify the type + * (access/default) for each entry. Internally, the ACL data is just + * a soup of entries. API calls here allow you to retrieve just the + * entries of interest. This design (which goes against the spirit of + * POSIX.1e) is useful for handling archive formats that combine + * default and access information in a single ACL list. + */ +__LA_DECL void archive_entry_acl_clear(struct archive_entry *); +__LA_DECL void archive_entry_acl_add_entry(struct archive_entry *, + int /* type */, int /* permset */, int /* tag */, + int /* qual */, const char * /* name */); +__LA_DECL void archive_entry_acl_add_entry_w(struct archive_entry *, + int /* type */, int /* permset */, int /* tag */, + int /* qual */, const wchar_t * /* name */); + +/* + * To retrieve the ACL, first "reset", then repeatedly ask for the + * "next" entry. The want_type parameter allows you to request only + * access entries or only default entries. 
+ */ +__LA_DECL int archive_entry_acl_reset(struct archive_entry *, int /* want_type */); +__LA_DECL int archive_entry_acl_next(struct archive_entry *, int /* want_type */, + int * /* type */, int * /* permset */, int * /* tag */, + int * /* qual */, const char ** /* name */); +__LA_DECL int archive_entry_acl_next_w(struct archive_entry *, int /* want_type */, + int * /* type */, int * /* permset */, int * /* tag */, + int * /* qual */, const wchar_t ** /* name */); + +/* + * Construct a text-format ACL. The flags argument is a bitmask that + * can include any of the following: + * + * ARCHIVE_ENTRY_ACL_TYPE_ACCESS - Include access entries. + * ARCHIVE_ENTRY_ACL_TYPE_DEFAULT - Include default entries. + * ARCHIVE_ENTRY_ACL_STYLE_EXTRA_ID - Include extra numeric ID field in + * each ACL entry. (As used by 'star'.) + * ARCHIVE_ENTRY_ACL_STYLE_MARK_DEFAULT - Include "default:" before each + * default ACL entry. + */ +#define ARCHIVE_ENTRY_ACL_STYLE_EXTRA_ID 1024 +#define ARCHIVE_ENTRY_ACL_STYLE_MARK_DEFAULT 2048 +__LA_DECL const wchar_t *archive_entry_acl_text_w(struct archive_entry *, + int /* flags */); + +/* Return a count of entries matching 'want_type' */ +__LA_DECL int archive_entry_acl_count(struct archive_entry *, int /* want_type */); + +/* + * Private ACL parser. This is private because it handles some + * very weird formats that clients should not be messing with. + * Clients should only deal with their platform-native formats. + * Because of the need to support many formats cleanly, new arguments + * are likely to get added on a regular basis. Clients who try to use + * this interface are likely to be surprised when it changes. + * + * You were warned! + * + * TODO: Move this declaration out of the public header and into + * a private header. Warnings above are silly. 
 + */ +__LA_DECL int __archive_entry_acl_parse_w(struct archive_entry *, + const wchar_t *, int /* type */); + +/* + * extended attributes + */ + +__LA_DECL void archive_entry_xattr_clear(struct archive_entry *); +__LA_DECL void archive_entry_xattr_add_entry(struct archive_entry *, + const char * /* name */, const void * /* value */, + size_t /* size */); + +/* + * To retrieve the xattr list, first "reset", then repeatedly ask for the + * "next" entry. + */ + +__LA_DECL int archive_entry_xattr_count(struct archive_entry *); +__LA_DECL int archive_entry_xattr_reset(struct archive_entry *); +__LA_DECL int archive_entry_xattr_next(struct archive_entry *, + const char ** /* name */, const void ** /* value */, size_t *); + +/* + * Utility to match up hardlinks. + * + * The 'struct archive_entry_linkresolver' is a cache of archive entries + * for files with multiple links. Here's how to use it: + * 1. Create a lookup object with archive_entry_linkresolver_new() + * 2. Tell it the archive format you're using. + * 3. Hand each archive_entry to archive_entry_linkify(). + * That function will return 0, 1, or 2 entries that should + * be written. + * 4. Call archive_entry_linkify(resolver, NULL) until + * no more entries are returned. + * 5. Call archive_entry_linkresolver_free(resolver) to free resources. + * + * The entries returned have their hardlink and size fields updated + * appropriately. If an entry is passed in that does not refer to + * a file with multiple links, it is returned unchanged. The intention + * is that you should be able to simply filter all entries through + * this machine. + * + * To make things more efficient, be sure that each entry has a valid + * nlinks value. The hardlink cache uses this to track when all links + * have been found. If the nlinks value is zero, it will keep every + * name in the cache indefinitely, which can use a lot of memory.
+ * + * Note that archive_entry_size() is reset to zero if the file + * body should not be written to the archive. Pay attention! + */ +struct archive_entry_linkresolver; + +/* + * There are three different strategies for marking hardlinks. + * The descriptions below name them after the best-known + * formats that rely on each strategy: + * + * "Old cpio" is the simplest, it always returns any entry unmodified. + * As far as I know, only cpio formats use this. Old cpio archives + * store every link with the full body; the onus is on the dearchiver + * to detect and properly link the files as they are restored. + * "tar" is also pretty simple; it caches a copy the first time it sees + * any link. Subsequent appearances are modified to be hardlink + * references to the first one without any body. Used by all tar + * formats, although the newest tar formats permit the "old cpio" strategy + * as well. This strategy is very simple for the dearchiver, + * and reasonably straightforward for the archiver. + * "new cpio" is trickier. It stores the body only with the last + * occurrence. The complication is that we might not + * see every link to a particular file in a single session, so + * there's no easy way to know when we've seen the last occurrence. + * The solution here is to queue one link until we see the next. + * At the end of the session, you can enumerate any remaining + * entries by calling archive_entry_linkify(NULL) and store those + * bodies. If you have a file with three links l1, l2, and l3, + * you'll get the following behavior if you see all three links: + * linkify(l1) => NULL (the resolver stores l1 internally) + * linkify(l2) => l1 (resolver stores l2, you write l1) + * linkify(l3) => l2, l3 (all links seen, you can write both). 
 + * If you only see l1 and l2, you'll get this behavior: + * linkify(l1) => NULL + * linkify(l2) => l1 + * linkify(NULL) => l2 (at end, you retrieve remaining links) + * As the name suggests, this strategy is used by newer cpio variants. + * It's noticeably more complex for the archiver, slightly more complex + * for the dearchiver than the tar strategy, but makes it straightforward + * to restore a file using any link by simply continuing to scan until + * you see a link that is stored with a body. In contrast, the tar + * strategy requires you to rescan the archive from the beginning to + * correctly extract an arbitrary link. + */ + +__LA_DECL struct archive_entry_linkresolver *archive_entry_linkresolver_new(void); +__LA_DECL void archive_entry_linkresolver_set_strategy( + struct archive_entry_linkresolver *, int /* format_code */); +__LA_DECL void archive_entry_linkresolver_free(struct archive_entry_linkresolver *); +__LA_DECL void archive_entry_linkify(struct archive_entry_linkresolver *, + struct archive_entry **, struct archive_entry **); + +#ifdef __cplusplus +} +#endif + +/* This is meaningless outside of this header. */ +#undef __LA_DECL + +#endif /* !ARCHIVE_ENTRY_H_INCLUDED */ diff --git a/lib/libarchive/archive_entry_copy_bhfi.c b/lib/libarchive/archive_entry_copy_bhfi.c new file mode 100644 index 000000000..8339032c5 --- /dev/null +++ b/lib/libarchive/archive_entry_copy_bhfi.c @@ -0,0 +1,74 @@ +/*- + * Copyright (c) 2003-2007 Tim Kientzle + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2.
Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR(S) ``AS IS'' AND ANY EXPRESS OR + * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES + * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. + * IN NO EVENT SHALL THE AUTHOR(S) BE LIABLE FOR ANY DIRECT, INDIRECT, + * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT + * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF + * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + */ + +#include "archive_platform.h" +__FBSDID("$FreeBSD$"); + +#include "archive_private.h" +#include "archive_entry.h" + +#if defined(_WIN32) && !defined(__CYGWIN__) + +#define EPOC_TIME ARCHIVE_LITERAL_ULL(116444736000000000) + +__inline static void +fileTimeToUtc(const FILETIME *filetime, time_t *time, long *ns) +{ + ULARGE_INTEGER utc; + + utc.HighPart = filetime->dwHighDateTime; + utc.LowPart = filetime->dwLowDateTime; + if (utc.QuadPart >= EPOC_TIME) { + utc.QuadPart -= EPOC_TIME; + *time = (time_t)(utc.QuadPart / 10000000); /* whole seconds (FILETIME counts 100 ns ticks) */ + *ns = (long)(utc.QuadPart % 10000000) * 100; /* remainder in nanoseconds */ + } else { + *time = 0; + *ns = 0; + } +} + +void +archive_entry_copy_bhfi(struct archive_entry *entry, + BY_HANDLE_FILE_INFORMATION *bhfi) +{ + time_t secs; + long nsecs; + + fileTimeToUtc(&bhfi->ftLastAccessTime, &secs, &nsecs); + archive_entry_set_atime(entry, secs, nsecs); + fileTimeToUtc(&bhfi->ftLastWriteTime, &secs, &nsecs); + archive_entry_set_mtime(entry, secs, nsecs); +
fileTimeToUtc(&bhfi->ftCreationTime, &secs, &nsecs); + archive_entry_set_birthtime(entry, secs, nsecs); + archive_entry_set_dev(entry, bhfi->dwVolumeSerialNumber); + archive_entry_set_ino64(entry, (((int64_t)bhfi->nFileIndexHigh) << 32) + + bhfi->nFileIndexLow); + archive_entry_set_nlink(entry, bhfi->nNumberOfLinks); + archive_entry_set_size(entry, (((int64_t)bhfi->nFileSizeHigh) << 32) + + bhfi->nFileSizeLow); +// archive_entry_set_mode(entry, st->st_mode); +} +#endif diff --git a/lib/libarchive/archive_entry_copy_stat.c b/lib/libarchive/archive_entry_copy_stat.c new file mode 100644 index 000000000..ef59a5e78 --- /dev/null +++ b/lib/libarchive/archive_entry_copy_stat.c @@ -0,0 +1,77 @@ +/*- + * Copyright (c) 2003-2007 Tim Kientzle + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR(S) ``AS IS'' AND ANY EXPRESS OR + * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES + * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. 
 + * IN NO EVENT SHALL THE AUTHOR(S) BE LIABLE FOR ANY DIRECT, INDIRECT, + * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT + * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF + * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + */ + +#include "archive_platform.h" +__FBSDID("$FreeBSD: head/lib/libarchive/archive_entry_copy_stat.c 189466 2009-03-07 00:52:02Z kientzle $"); + +#ifdef HAVE_SYS_STAT_H +#include <sys/stat.h> +#endif + +#include "archive_entry.h" + +void +archive_entry_copy_stat(struct archive_entry *entry, const struct stat *st) +{ +#if HAVE_STRUCT_STAT_ST_MTIMESPEC_TV_NSEC + archive_entry_set_atime(entry, st->st_atime, st->st_atimespec.tv_nsec); + archive_entry_set_ctime(entry, st->st_ctime, st->st_ctimespec.tv_nsec); + archive_entry_set_mtime(entry, st->st_mtime, st->st_mtimespec.tv_nsec); +#elif HAVE_STRUCT_STAT_ST_MTIM_TV_NSEC + archive_entry_set_atime(entry, st->st_atime, st->st_atim.tv_nsec); + archive_entry_set_ctime(entry, st->st_ctime, st->st_ctim.tv_nsec); + archive_entry_set_mtime(entry, st->st_mtime, st->st_mtim.tv_nsec); +#elif HAVE_STRUCT_STAT_ST_MTIME_N + archive_entry_set_atime(entry, st->st_atime, st->st_atime_n); + archive_entry_set_ctime(entry, st->st_ctime, st->st_ctime_n); + archive_entry_set_mtime(entry, st->st_mtime, st->st_mtime_n); +#elif HAVE_STRUCT_STAT_ST_UMTIME + archive_entry_set_atime(entry, st->st_atime, st->st_uatime * 1000); + archive_entry_set_ctime(entry, st->st_ctime, st->st_uctime * 1000); + archive_entry_set_mtime(entry, st->st_mtime, st->st_umtime * 1000); +#elif HAVE_STRUCT_STAT_ST_MTIME_USEC + archive_entry_set_atime(entry, st->st_atime, st->st_atime_usec * 1000); + archive_entry_set_ctime(entry, st->st_ctime, st->st_ctime_usec * 1000); +
archive_entry_set_mtime(entry, st->st_mtime, st->st_mtime_usec * 1000); +#else + archive_entry_set_atime(entry, st->st_atime, 0); + archive_entry_set_ctime(entry, st->st_ctime, 0); + archive_entry_set_mtime(entry, st->st_mtime, 0); +#if HAVE_STRUCT_STAT_ST_BIRTHTIME + archive_entry_set_birthtime(entry, st->st_birthtime, 0); +#endif +#endif +#if HAVE_STRUCT_STAT_ST_BIRTHTIMESPEC_TV_NSEC + archive_entry_set_birthtime(entry, st->st_birthtime, st->st_birthtimespec.tv_nsec); +#endif + archive_entry_set_dev(entry, st->st_dev); + archive_entry_set_gid(entry, st->st_gid); + archive_entry_set_uid(entry, st->st_uid); + archive_entry_set_ino(entry, st->st_ino); + archive_entry_set_nlink(entry, st->st_nlink); + archive_entry_set_rdev(entry, st->st_rdev); + archive_entry_set_size(entry, st->st_size); + archive_entry_set_mode(entry, st->st_mode); +} diff --git a/lib/libarchive/archive_entry_link_resolver.c b/lib/libarchive/archive_entry_link_resolver.c new file mode 100644 index 000000000..5ab348c6c --- /dev/null +++ b/lib/libarchive/archive_entry_link_resolver.c @@ -0,0 +1,430 @@ +/*- + * Copyright (c) 2003-2007 Tim Kientzle + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR(S) ``AS IS'' AND ANY EXPRESS OR + * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES + * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. 
 + * IN NO EVENT SHALL THE AUTHOR(S) BE LIABLE FOR ANY DIRECT, INDIRECT, + * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT + * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF + * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + */ + +#include "archive_platform.h" +__FBSDID("$FreeBSD: head/lib/libarchive/archive_entry_link_resolver.c 201100 2009-12-28 03:05:31Z kientzle $"); + +#ifdef HAVE_SYS_STAT_H +#include <sys/stat.h> +#endif +#ifdef HAVE_ERRNO_H +#include <errno.h> +#endif +#include <stdio.h> +#ifdef HAVE_STDLIB_H +#include <stdlib.h> +#endif +#ifdef HAVE_STRING_H +#include <string.h> +#endif + +#include "archive.h" +#include "archive_entry.h" + +/* + * This is mostly a pretty straightforward hash table implementation. + * The only interesting bit is the different strategies used to + * match up links. These strategies match those used by various + * archiving formats: + * tar - content stored with first link, remainder refer back to it. + * This requires us to match each subsequent link up with the + * first appearance. + * cpio - Old cpio just stored body with each link, match-ups were + * implicit. This is trivial. + * new cpio - New cpio only stores body with last link, match-ups + * are implicit. This is actually quite tricky; see the notes + * below. + */ + +/* Users pass us a format code, we translate that into a strategy here. */ +#define ARCHIVE_ENTRY_LINKIFY_LIKE_TAR 0 +#define ARCHIVE_ENTRY_LINKIFY_LIKE_MTREE 1 +#ifndef __minix +#define ARCHIVE_ENTRY_LINKIFY_LIKE_OLD_CPIO 2 +#define ARCHIVE_ENTRY_LINKIFY_LIKE_NEW_CPIO 3 +#endif + +/* Initial size of link cache.
*/ +#define links_cache_initial_size 1024 + +struct links_entry { + struct links_entry *next; + struct links_entry *previous; + int links; /* # links not yet seen */ + int hash; + struct archive_entry *canonical; + struct archive_entry *entry; +}; + +struct archive_entry_linkresolver { + struct links_entry **buckets; + struct links_entry *spare; + unsigned long number_entries; + size_t number_buckets; + int strategy; +}; + +static struct links_entry *find_entry(struct archive_entry_linkresolver *, + struct archive_entry *); +static void grow_hash(struct archive_entry_linkresolver *); +static struct links_entry *insert_entry(struct archive_entry_linkresolver *, + struct archive_entry *); +static struct links_entry *next_entry(struct archive_entry_linkresolver *); + +struct archive_entry_linkresolver * +archive_entry_linkresolver_new(void) +{ + struct archive_entry_linkresolver *res; + size_t i; + + res = malloc(sizeof(struct archive_entry_linkresolver)); + if (res == NULL) + return (NULL); + memset(res, 0, sizeof(struct archive_entry_linkresolver)); + res->number_buckets = links_cache_initial_size; + res->buckets = malloc(res->number_buckets * + sizeof(res->buckets[0])); + if (res->buckets == NULL) { + free(res); + return (NULL); + } + for (i = 0; i < res->number_buckets; i++) + res->buckets[i] = NULL; + return (res); +} + +void +archive_entry_linkresolver_set_strategy(struct archive_entry_linkresolver *res, + int fmt) +{ + int fmtbase = fmt & ARCHIVE_FORMAT_BASE_MASK; + + switch (fmtbase) { +#ifndef __minix + case ARCHIVE_FORMAT_CPIO: + switch (fmt) { + case ARCHIVE_FORMAT_CPIO_SVR4_NOCRC: + case ARCHIVE_FORMAT_CPIO_SVR4_CRC: + res->strategy = ARCHIVE_ENTRY_LINKIFY_LIKE_NEW_CPIO; + break; + default: + res->strategy = ARCHIVE_ENTRY_LINKIFY_LIKE_OLD_CPIO; + break; + } + break; +#endif + case ARCHIVE_FORMAT_MTREE: + res->strategy = ARCHIVE_ENTRY_LINKIFY_LIKE_MTREE; + break; + case ARCHIVE_FORMAT_TAR: + res->strategy = ARCHIVE_ENTRY_LINKIFY_LIKE_TAR; + break; + 
default: + res->strategy = ARCHIVE_ENTRY_LINKIFY_LIKE_TAR; + break; + } +} + +void +archive_entry_linkresolver_free(struct archive_entry_linkresolver *res) +{ + struct links_entry *le; + + if (res == NULL) + return; + + if (res->buckets != NULL) { + while ((le = next_entry(res)) != NULL) + archive_entry_free(le->entry); + free(res->buckets); + res->buckets = NULL; + } + free(res); +} + +void +archive_entry_linkify(struct archive_entry_linkresolver *res, + struct archive_entry **e, struct archive_entry **f) +{ + struct links_entry *le; + struct archive_entry *t; + + *f = NULL; /* Default: Don't return a second entry. */ + + if (*e == NULL) { + le = next_entry(res); + if (le != NULL) { + *e = le->entry; + le->entry = NULL; + } + return; + } + + /* If it has only one link, then we're done. */ + if (archive_entry_nlink(*e) == 1) + return; + /* Directories, devices never have hardlinks. */ + if (archive_entry_filetype(*e) == AE_IFDIR + || archive_entry_filetype(*e) == AE_IFBLK + || archive_entry_filetype(*e) == AE_IFCHR) + return; + + switch (res->strategy) { + case ARCHIVE_ENTRY_LINKIFY_LIKE_TAR: + le = find_entry(res, *e); + if (le != NULL) { + archive_entry_unset_size(*e); + archive_entry_copy_hardlink(*e, + archive_entry_pathname(le->canonical)); + } else + insert_entry(res, *e); + return; + case ARCHIVE_ENTRY_LINKIFY_LIKE_MTREE: + le = find_entry(res, *e); + if (le != NULL) { + archive_entry_copy_hardlink(*e, + archive_entry_pathname(le->canonical)); + } else + insert_entry(res, *e); + return; +#ifndef __minix + case ARCHIVE_ENTRY_LINKIFY_LIKE_OLD_CPIO: + /* This one is trivial. */ + return; + case ARCHIVE_ENTRY_LINKIFY_LIKE_NEW_CPIO: + le = find_entry(res, *e); + if (le != NULL) { + /* + * Put the new entry in le, return the + * old entry from le. + */ + t = *e; + *e = le->entry; + le->entry = t; + /* Make the old entry into a hardlink. 
*/ + archive_entry_unset_size(*e); + archive_entry_copy_hardlink(*e, + archive_entry_pathname(le->canonical)); + /* If we ran out of links, return the + * final entry as well. */ + if (le->links == 0) { + *f = le->entry; + le->entry = NULL; + } + } else { + /* + * If we haven't seen it, tuck it away + * for future use. + */ + le = insert_entry(res, *e); + le->entry = *e; + *e = NULL; + } + return; +#endif + default: + break; + } + return; +} + +static struct links_entry * +find_entry(struct archive_entry_linkresolver *res, + struct archive_entry *entry) +{ + struct links_entry *le; + int hash, bucket; + dev_t dev; +#ifndef __minix + int64_t ino; +#else + int32_t ino; +#endif + + /* Free a held entry. */ + if (res->spare != NULL) { + archive_entry_free(res->spare->canonical); + archive_entry_free(res->spare->entry); + free(res->spare); + res->spare = NULL; + } + + /* If the links cache overflowed and got flushed, don't bother. */ + if (res->buckets == NULL) + return (NULL); + + dev = archive_entry_dev(entry); +#ifndef __minix + ino = archive_entry_ino64(entry); +#else + ino = archive_entry_ino(entry); +#endif + hash = (int)(dev ^ ino); + + /* Try to locate this entry in the links cache. */ + bucket = hash % res->number_buckets; + for (le = res->buckets[bucket]; le != NULL; le = le->next) { +#ifndef __minix + if (le->hash == hash + && dev == archive_entry_dev(le->canonical) + && ino == archive_entry_ino64(le->canonical)) { +#else + if (le->hash == hash + && dev == archive_entry_dev(le->canonical) + && ino == archive_entry_ino(le->canonical)) { +#endif + /* + * Decrement link count each time and release + * the entry if it hits zero. This saves + * memory and is necessary for detecting + * missed links. + */ + --le->links; + if (le->links > 0) + return (le); + /* Remove it from this hash bucket. 
*/ + if (le->previous != NULL) + le->previous->next = le->next; + if (le->next != NULL) + le->next->previous = le->previous; + if (res->buckets[bucket] == le) + res->buckets[bucket] = le->next; + res->number_entries--; + /* Defer freeing this entry. */ + res->spare = le; + return (le); + } + } + return (NULL); +} + + +static struct links_entry * +next_entry(struct archive_entry_linkresolver *res) +{ + struct links_entry *le; + size_t bucket; + + /* Free a held entry. */ + if (res->spare != NULL) { + archive_entry_free(res->spare->canonical); + free(res->spare); + res->spare = NULL; + } + + /* If the links cache overflowed and got flushed, don't bother. */ + if (res->buckets == NULL) + return (NULL); + + /* Look for next non-empty bucket in the links cache. */ + for (bucket = 0; bucket < res->number_buckets; bucket++) { + le = res->buckets[bucket]; + if (le != NULL) { + /* Remove it from this hash bucket. */ + if (le->next != NULL) + le->next->previous = le->previous; + res->buckets[bucket] = le->next; + res->number_entries--; + /* Defer freeing this entry. */ + res->spare = le; + return (le); + } + } + return (NULL); +} + +static struct links_entry * +insert_entry(struct archive_entry_linkresolver *res, + struct archive_entry *entry) +{ + struct links_entry *le; + int hash, bucket; + + /* Add this entry to the links cache. */ + le = malloc(sizeof(struct links_entry)); + if (le == NULL) + return (NULL); + memset(le, 0, sizeof(*le)); + le->canonical = archive_entry_clone(entry); + + /* If the links cache is getting too full, enlarge the hash table. */ + if (res->number_entries > res->number_buckets * 2) + grow_hash(res); + +#ifndef __minix + hash = archive_entry_dev(entry) ^ archive_entry_ino64(entry); +#else + hash = ((int)archive_entry_dev(entry)) ^ ((int)archive_entry_ino(entry)); +#endif + bucket = hash % res->number_buckets; + + /* If we could allocate the entry, record it. 
*/ + if (res->buckets[bucket] != NULL) + res->buckets[bucket]->previous = le; + res->number_entries++; + le->next = res->buckets[bucket]; + le->previous = NULL; + res->buckets[bucket] = le; + le->hash = hash; + le->links = archive_entry_nlink(entry) - 1; + return (le); +} + +static void +grow_hash(struct archive_entry_linkresolver *res) +{ + struct links_entry *le, **new_buckets; + size_t new_size; + size_t i, bucket; + + /* Try to enlarge the bucket list. */ + new_size = res->number_buckets * 2; + new_buckets = malloc(new_size * sizeof(struct links_entry *)); + + if (new_buckets != NULL) { + memset(new_buckets, 0, + new_size * sizeof(struct links_entry *)); + for (i = 0; i < res->number_buckets; i++) { + while (res->buckets[i] != NULL) { + /* Remove entry from old bucket. */ + le = res->buckets[i]; + res->buckets[i] = le->next; + + /* Add entry to new bucket. */ + bucket = le->hash % new_size; + + if (new_buckets[bucket] != NULL) + new_buckets[bucket]->previous = + le; + le->next = new_buckets[bucket]; + le->previous = NULL; + new_buckets[bucket] = le; + } + } + free(res->buckets); + res->buckets = new_buckets; + res->number_buckets = new_size; + } +} diff --git a/lib/libarchive/archive_entry_private.h b/lib/libarchive/archive_entry_private.h new file mode 100644 index 000000000..75df4a5e8 --- /dev/null +++ b/lib/libarchive/archive_entry_private.h @@ -0,0 +1,208 @@ +/*- + * Copyright (c) 2003-2007 Tim Kientzle + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. 
+ * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR(S) ``AS IS'' AND ANY EXPRESS OR + * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES + * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. + * IN NO EVENT SHALL THE AUTHOR(S) BE LIABLE FOR ANY DIRECT, INDIRECT, + * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT + * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF + * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + * + * $FreeBSD: head/lib/libarchive/archive_entry_private.h 201096 2009-12-28 02:41:27Z kientzle $ + */ + +#ifndef __LIBARCHIVE_BUILD +#error This header is only to be used internally to libarchive. +#endif + +#ifndef ARCHIVE_ENTRY_PRIVATE_H_INCLUDED +#define ARCHIVE_ENTRY_PRIVATE_H_INCLUDED + +#include "archive_string.h" + +/* + * Handle wide character (i.e., Unicode) and non-wide character + * strings transparently. + */ + +struct aes { + struct archive_string aes_mbs; + struct archive_string aes_utf8; + const wchar_t *aes_wcs; + /* Bitmap of which of the above are valid. Because we're lazy + * about malloc-ing and reusing the underlying storage, we + * can't rely on NULL pointers to indicate whether a string + * has been set. */ + int aes_set; +#define AES_SET_MBS 1 +#define AES_SET_UTF8 2 +#define AES_SET_WCS 4 +}; + +struct ae_acl { + struct ae_acl *next; + int type; /* E.g., access or default */ + int tag; /* E.g., user/group/other/mask */ + int permset; /* r/w/x bits */ + int id; /* uid/gid for user/group */ + struct aes name; /* uname/gname */ +}; + +struct ae_xattr { + struct ae_xattr *next; + + char *name; + void *value; + size_t size; +}; + +/* + * Description of an archive entry. 
+ * + * Basically, this is a "struct stat" with a few text fields added in. + * + * TODO: Add "comment", "charset", and possibly other entries + * that are supported by "pax interchange" format. However, GNU, ustar, + * cpio, and other variants don't support these features, so they're not an + * excruciatingly high priority right now. + * + * TODO: "pax interchange" format allows essentially arbitrary + * key/value attributes to be attached to any entry. Supporting + * such extensions may make this library useful for special + * applications (e.g., a package manager could attach special + * package-management attributes to each entry). There are tricky + * API issues involved, so this is not going to happen until + * there's a real demand for it. + * + * TODO: Design a good API for handling sparse files. + */ +struct archive_entry { + /* + * Note that ae_stat.st_mode & AE_IFMT can be 0! + * + * This occurs when the actual file type of the object is not + * in the archive. For example, 'tar' archives store + * hardlinks without marking the type of the underlying + * object. + */ + + /* + * Read archive_entry_copy_stat.c for an explanation of why I + * don't just use "struct stat" instead of "struct aest" here + * and why I have this odd pointer to a separately-allocated + * struct stat. + */ + void *stat; + int stat_valid; /* Set to 0 whenever a field in aest changes. 
*/ + + struct aest { +#ifndef __minix + uint64_t aest_atime; +#else + time_t aest_atime; +#endif + uint32_t aest_atime_nsec; +#ifndef __minix + int64_t aest_ctime; +#else + time_t aest_ctime; +#endif + uint32_t aest_ctime_nsec; +#ifndef __minix + int64_t aest_mtime; +#else + time_t aest_mtime; +#endif + uint32_t aest_mtime_nsec; +#ifndef __minix + int64_t aest_birthtime; +#else + time_t aest_birthtime; +#endif + uint32_t aest_birthtime_nsec; + gid_t aest_gid; +#ifndef __minix + int64_t aest_ino; +#else + ino_t aest_ino; +#endif + mode_t aest_mode; + uint32_t aest_nlink; +#ifndef __minix + uint64_t aest_size; +#else + size_t aest_size; +#endif + uid_t aest_uid; + /* + * Because converting between device codes and + * major/minor values is platform-specific and + * inherently a bit risky, we only do that conversion + * lazily. That way, we will do a better job of + * preserving information in those cases where no + * conversion is actually required. + */ + int aest_dev_is_broken_down; + dev_t aest_dev; + dev_t aest_devmajor; + dev_t aest_devminor; + int aest_rdev_is_broken_down; + dev_t aest_rdev; + dev_t aest_rdevmajor; + dev_t aest_rdevminor; + } ae_stat; + + int ae_set; /* bitmap of fields that are currently set */ +#define AE_SET_HARDLINK 1 +#define AE_SET_SYMLINK 2 +#define AE_SET_ATIME 4 +#define AE_SET_CTIME 8 +#define AE_SET_MTIME 16 +#define AE_SET_BIRTHTIME 32 +#define AE_SET_SIZE 64 + + /* + * Use aes here so that we get transparent mbs<->wcs conversions. + */ + struct aes ae_fflags_text; /* Text fflags per fflagstostr(3) */ + unsigned long ae_fflags_set; /* Bitmap fflags */ + unsigned long ae_fflags_clear; + struct aes ae_gname; /* Name of owning group */ + struct aes ae_hardlink; /* Name of target for hardlink */ + struct aes ae_pathname; /* Name of entry */ + struct aes ae_symlink; /* symlink contents */ + struct aes ae_uname; /* Name of owner */ + + /* Not used within libarchive; useful for some clients. 
*/ + struct aes ae_sourcepath; /* Path this entry is sourced from. */ + + /* ACL support. */ + struct ae_acl *acl_head; + struct ae_acl *acl_p; + int acl_state; /* See acl_next for details. */ + wchar_t *acl_text_w; + + /* extattr support. */ + struct ae_xattr *xattr_head; + struct ae_xattr *xattr_p; + + /* Miscellaneous. */ + char strmode[12]; +}; + + +#endif /* ARCHIVE_ENTRY_PRIVATE_H_INCLUDED */ diff --git a/lib/libarchive/archive_entry_stat.c b/lib/libarchive/archive_entry_stat.c new file mode 100644 index 000000000..13371566d --- /dev/null +++ b/lib/libarchive/archive_entry_stat.c @@ -0,0 +1,123 @@ +/*- + * Copyright (c) 2003-2007 Tim Kientzle + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR(S) ``AS IS'' AND ANY EXPRESS OR + * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES + * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. + * IN NO EVENT SHALL THE AUTHOR(S) BE LIABLE FOR ANY DIRECT, INDIRECT, + * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT + * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF + * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 
+ */ + +#include "archive_platform.h" +__FBSDID("$FreeBSD: head/lib/libarchive/archive_entry_stat.c 201100 2009-12-28 03:05:31Z kientzle $"); + +#ifdef HAVE_SYS_STAT_H +#include <sys/stat.h> +#endif +#ifdef HAVE_STDLIB_H +#include <stdlib.h> +#endif + +#include "archive_entry.h" +#include "archive_entry_private.h" + +const struct stat * +archive_entry_stat(struct archive_entry *entry) +{ + struct stat *st; + if (entry->stat == NULL) { + entry->stat = malloc(sizeof(*st)); + if (entry->stat == NULL) + return (NULL); + entry->stat_valid = 0; + } + + /* + * If none of the underlying fields have been changed, we + * don't need to regenerate. In theory, we could use a bitmap + * here to flag only those items that have changed, but the + * extra complexity probably isn't worth it. It will be very + * rare for anyone to change just one field then request a new + * stat structure. + */ + if (entry->stat_valid) + return (entry->stat); + + st = entry->stat; + /* + * Use the public interfaces to extract items, so that + * the appropriate conversions get invoked. + */ + st->st_atime = archive_entry_atime(entry); +#if HAVE_STRUCT_STAT_ST_BIRTHTIME + st->st_birthtime = archive_entry_birthtime(entry); +#endif + st->st_ctime = archive_entry_ctime(entry); + st->st_mtime = archive_entry_mtime(entry); + st->st_dev = archive_entry_dev(entry); + st->st_gid = archive_entry_gid(entry); + st->st_uid = archive_entry_uid(entry); +#ifndef __minix + st->st_ino = archive_entry_ino64(entry); +#else + st->st_ino = archive_entry_ino(entry); +#endif + + st->st_nlink = archive_entry_nlink(entry); + st->st_rdev = archive_entry_rdev(entry); + st->st_size = archive_entry_size(entry); + st->st_mode = archive_entry_mode(entry); + + /* + * On systems that support high-res timestamps, copy that + * information into struct stat.
+ */ +#if HAVE_STRUCT_STAT_ST_MTIMESPEC_TV_NSEC + st->st_atimespec.tv_nsec = archive_entry_atime_nsec(entry); + st->st_ctimespec.tv_nsec = archive_entry_ctime_nsec(entry); + st->st_mtimespec.tv_nsec = archive_entry_mtime_nsec(entry); +#elif HAVE_STRUCT_STAT_ST_MTIM_TV_NSEC + st->st_atim.tv_nsec = archive_entry_atime_nsec(entry); + st->st_ctim.tv_nsec = archive_entry_ctime_nsec(entry); + st->st_mtim.tv_nsec = archive_entry_mtime_nsec(entry); +#elif HAVE_STRUCT_STAT_ST_MTIME_N + st->st_atime_n = archive_entry_atime_nsec(entry); + st->st_ctime_n = archive_entry_ctime_nsec(entry); + st->st_mtime_n = archive_entry_mtime_nsec(entry); +#elif HAVE_STRUCT_STAT_ST_UMTIME + st->st_uatime = archive_entry_atime_nsec(entry) / 1000; + st->st_uctime = archive_entry_ctime_nsec(entry) / 1000; + st->st_umtime = archive_entry_mtime_nsec(entry) / 1000; +#elif HAVE_STRUCT_STAT_ST_MTIME_USEC + st->st_atime_usec = archive_entry_atime_nsec(entry) / 1000; + st->st_ctime_usec = archive_entry_ctime_nsec(entry) / 1000; + st->st_mtime_usec = archive_entry_mtime_nsec(entry) / 1000; +#endif +#if HAVE_STRUCT_STAT_ST_BIRTHTIMESPEC_TV_NSEC + st->st_birthtimespec.tv_nsec = archive_entry_birthtime_nsec(entry); +#endif + + /* + * TODO: On Linux, store 32 or 64 here depending on whether + * the cached stat structure is a stat32 or a stat64. This + * will allow us to support both variants interchangably. + */ + entry->stat_valid = 1; + + return (st); +} diff --git a/lib/libarchive/archive_entry_strmode.c b/lib/libarchive/archive_entry_strmode.c new file mode 100644 index 000000000..16cb3f7bb --- /dev/null +++ b/lib/libarchive/archive_entry_strmode.c @@ -0,0 +1,87 @@ +/*- + * Copyright (c) 2003-2007 Tim Kientzle + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. 
Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR(S) ``AS IS'' AND ANY EXPRESS OR + * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES + * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. + * IN NO EVENT SHALL THE AUTHOR(S) BE LIABLE FOR ANY DIRECT, INDIRECT, + * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT + * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF + * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + */ + +#include "archive_platform.h" +__FBSDID("$FreeBSD: src/lib/libarchive/archive_entry_strmode.c,v 1.4 2008/06/15 05:14:01 kientzle Exp $"); + +#ifdef HAVE_SYS_STAT_H +#include <sys/stat.h> +#endif +#ifdef HAVE_STRING_H +#include <string.h> +#endif + +#include "archive_entry.h" +#include "archive_entry_private.h" + +const char * +archive_entry_strmode(struct archive_entry *entry) +{ + static const mode_t permbits[] = + { 0400, 0200, 0100, 0040, 0020, 0010, 0004, 0002, 0001 }; + char *bp = entry->strmode; + mode_t mode; + int i; + + /* Fill in a default string, then selectively override.
*/ + strcpy(bp, "?rwxrwxrwx "); + + mode = archive_entry_mode(entry); + switch (archive_entry_filetype(entry)) { + case AE_IFREG: bp[0] = '-'; break; + case AE_IFBLK: bp[0] = 'b'; break; + case AE_IFCHR: bp[0] = 'c'; break; + case AE_IFDIR: bp[0] = 'd'; break; + case AE_IFLNK: bp[0] = 'l'; break; + case AE_IFSOCK: bp[0] = 's'; break; + case AE_IFIFO: bp[0] = 'p'; break; + default: + if (archive_entry_hardlink(entry) != NULL) { + bp[0] = 'h'; + break; + } + } + + for (i = 0; i < 9; i++) + if (!(mode & permbits[i])) + bp[i+1] = '-'; + + if (mode & S_ISUID) { + if (mode & 0100) bp[3] = 's'; + else bp[3] = 'S'; + } + if (mode & S_ISGID) { + if (mode & 0010) bp[6] = 's'; + else bp[6] = 'S'; + } + if (mode & S_ISVTX) { + if (mode & 0001) bp[9] = 't'; + else bp[9] = 'T'; + } + if (archive_entry_acl_count(entry, ARCHIVE_ENTRY_ACL_TYPE_ACCESS)) + bp[10] = '+'; + + return (bp); +} diff --git a/lib/libarchive/archive_entry_xattr.c b/lib/libarchive/archive_entry_xattr.c new file mode 100644 index 000000000..a3efe7ca8 --- /dev/null +++ b/lib/libarchive/archive_entry_xattr.c @@ -0,0 +1,158 @@ +/*- + * Copyright (c) 2003-2007 Tim Kientzle + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR(S) ``AS IS'' AND ANY EXPRESS OR + * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES + * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. 
+ * IN NO EVENT SHALL THE AUTHOR(S) BE LIABLE FOR ANY DIRECT, INDIRECT, + * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT + * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF + * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + */ + +#include "archive_platform.h" +__FBSDID("$FreeBSD: head/lib/libarchive/archive_entry_xattr.c 201096 2009-12-28 02:41:27Z kientzle $"); + +#ifdef HAVE_SYS_STAT_H +#include <sys/stat.h> +#endif +#ifdef HAVE_SYS_TYPES_H +#include <sys/types.h> +#endif +#ifdef HAVE_LIMITS_H +#include <limits.h> +#endif +#ifdef HAVE_LINUX_FS_H +#include <linux/fs.h> /* for Linux file flags */ +#endif +/* + * Some Linux distributions have both linux/ext2_fs.h and ext2fs/ext2_fs.h. + * As the include guards don't agree, the order of include is important.
+ */ +#ifdef HAVE_LINUX_EXT2_FS_H +#include /* for Linux file flags */ +#endif +#if defined(HAVE_EXT2FS_EXT2_FS_H) && !defined(__CYGWIN__) +#include /* for Linux file flags */ +#endif +#include +#include +#ifdef HAVE_STDLIB_H +#include +#endif +#ifdef HAVE_STRING_H +#include +#endif +#ifdef HAVE_WCHAR_H +#include +#endif + +#include "archive.h" +#include "archive_entry.h" +#include "archive_private.h" +#include "archive_entry_private.h" + +/* + * extended attribute handling + */ + +void +archive_entry_xattr_clear(struct archive_entry *entry) +{ + struct ae_xattr *xp; + + while (entry->xattr_head != NULL) { + xp = entry->xattr_head->next; + free(entry->xattr_head->name); + free(entry->xattr_head->value); + free(entry->xattr_head); + entry->xattr_head = xp; + } + + entry->xattr_head = NULL; +} + +void +archive_entry_xattr_add_entry(struct archive_entry *entry, + const char *name, const void *value, size_t size) +{ + struct ae_xattr *xp; + + for (xp = entry->xattr_head; xp != NULL; xp = xp->next) + ; + + if ((xp = (struct ae_xattr *)malloc(sizeof(struct ae_xattr))) == NULL) + /* XXX Error XXX */ + return; + + xp->name = strdup(name); + if ((xp->value = malloc(size)) != NULL) { + memcpy(xp->value, value, size); + xp->size = size; + } else + xp->size = 0; + + xp->next = entry->xattr_head; + entry->xattr_head = xp; +} + + +/* + * returns number of the extended attribute entries + */ +int +archive_entry_xattr_count(struct archive_entry *entry) +{ + struct ae_xattr *xp; + int count = 0; + + for (xp = entry->xattr_head; xp != NULL; xp = xp->next) + count++; + + return count; +} + +int +archive_entry_xattr_reset(struct archive_entry * entry) +{ + entry->xattr_p = entry->xattr_head; + + return archive_entry_xattr_count(entry); +} + +int +archive_entry_xattr_next(struct archive_entry * entry, + const char **name, const void **value, size_t *size) +{ + if (entry->xattr_p) { + *name = entry->xattr_p->name; + *value = entry->xattr_p->value; + *size = entry->xattr_p->size; + + 
entry->xattr_p = entry->xattr_p->next; + + return (ARCHIVE_OK); + } else { + *name = NULL; + *value = NULL; + *size = (size_t)0; + return (ARCHIVE_WARN); + } +} + +/* + * end of xattr handling + */ diff --git a/lib/libarchive/archive_hash.h b/lib/libarchive/archive_hash.h new file mode 100644 index 000000000..1a3b3344d --- /dev/null +++ b/lib/libarchive/archive_hash.h @@ -0,0 +1,196 @@ +/*- + * Copyright (c) 2009 Joerg Sonnenberger + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR(S) ``AS IS'' AND ANY EXPRESS OR + * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES + * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. + * IN NO EVENT SHALL THE AUTHOR(S) BE LIABLE FOR ANY DIRECT, INDIRECT, + * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT + * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF + * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + * + * $FreeBSD: head/lib/libarchive/archive_hash.h 201171 2009-12-29 06:39:07Z kientzle $ + */ + +#ifndef __LIBARCHIVE_BUILD +#error This header is only to be used internally to libarchive. 
+#endif + +/* + * Hash function support in various Operating Systems: + * + * NetBSD: + * - MD5 and SHA1 in libc: without _ after algorithm name + * - SHA2 in libc: with _ after algorithm name + * + * OpenBSD: + * - MD5, SHA1 and SHA2 in libc: without _ after algorithm name + * - OpenBSD 4.4 and earlier have SHA2 in libc with _ after algorithm name + * + * DragonFly and FreeBSD (XXX not used yet): + * - MD5 and SHA1 in libmd: without _ after algorithm name + * - SHA256: with _ after algorithm name + * + * OpenSSL: + * - MD5, SHA1 and SHA2 in libcrypto: with _ after algorithm name + */ + +#if defined(HAVE_MD5_H) && defined(HAVE_MD5INIT) +# include +# define ARCHIVE_HAS_MD5 +typedef MD5_CTX archive_md5_ctx; +# define archive_md5_init(ctx) MD5Init(ctx) +# define archive_md5_final(ctx, buf) MD5Final(buf, ctx) +# define archive_md5_update(ctx, buf, n) MD5Update(ctx, buf, n) +#elif defined(HAVE_OPENSSL_MD5_H) +# include +# define ARCHIVE_HAS_MD5 +typedef MD5_CTX archive_md5_ctx; +# define archive_md5_init(ctx) MD5_Init(ctx) +# define archive_md5_final(ctx, buf) MD5_Final(buf, ctx) +# define archive_md5_update(ctx, buf, n) MD5_Update(ctx, buf, n) +#elif defined(_WIN32) && !defined(__CYGWIN__) && defined(CALG_MD5) +# define ARCHIVE_HAS_MD5 +typedef MD5_CTX archive_md5_ctx; +# define archive_md5_init(ctx) MD5_Init(ctx) +# define archive_md5_final(ctx, buf) MD5_Final(buf, ctx) +# define archive_md5_update(ctx, buf, n) MD5_Update(ctx, buf, n) +#endif + +#if defined(HAVE_RMD160_H) && defined(HAVE_RMD160INIT) +# include +# define ARCHIVE_HAS_RMD160 +typedef RMD160_CTX archive_rmd160_ctx; +# define archive_rmd160_init(ctx) RMD160Init(ctx) +# define archive_rmd160_final(ctx, buf) RMD160Final(buf, ctx) +# define archive_rmd160_update(ctx, buf, n) RMD160Update(ctx, buf, n) +#elif defined(HAVE_OPENSSL_RIPEMD_H) +# include +# define ARCHIVE_HAS_RMD160 +typedef RIPEMD160_CTX archive_rmd160_ctx; +# define archive_rmd160_init(ctx) RIPEMD160_Init(ctx) +# define archive_rmd160_final(ctx, 
buf) RIPEMD160_Final(buf, ctx) +# define archive_rmd160_update(ctx, buf, n) RIPEMD160_Update(ctx, buf, n) +#endif + +#if defined(HAVE_SHA1_H) && defined(HAVE_SHA1INIT) +# include +# define ARCHIVE_HAS_SHA1 +typedef SHA1_CTX archive_sha1_ctx; +# define archive_sha1_init(ctx) SHA1Init(ctx) +# define archive_sha1_final(ctx, buf) SHA1Final(buf, ctx) +# define archive_sha1_update(ctx, buf, n) SHA1Update(ctx, buf, n) +#elif defined(HAVE_OPENSSL_SHA_H) +# include +# define ARCHIVE_HAS_SHA1 +typedef SHA_CTX archive_sha1_ctx; +# define archive_sha1_init(ctx) SHA1_Init(ctx) +# define archive_sha1_final(ctx, buf) SHA1_Final(buf, ctx) +# define archive_sha1_update(ctx, buf, n) SHA1_Update(ctx, buf, n) +#elif defined(_WIN32) && !defined(__CYGWIN__) && defined(CALG_SHA1) +# define ARCHIVE_HAS_SHA1 +typedef SHA1_CTX archive_sha1_ctx; +# define archive_sha1_init(ctx) SHA1_Init(ctx) +# define archive_sha1_final(ctx, buf) SHA1_Final(buf, ctx) +# define archive_sha1_update(ctx, buf, n) SHA1_Update(ctx, buf, n) +#endif + +#if defined(HAVE_SHA2_H) && defined(HAVE_SHA256_INIT) +# include +# define ARCHIVE_HAS_SHA256 +typedef SHA256_CTX archive_sha256_ctx; +# define archive_sha256_init(ctx) SHA256_Init(ctx) +# define archive_sha256_final(ctx, buf) SHA256_Final(buf, ctx) +# define archive_sha256_update(ctx, buf, n) SHA256_Update(ctx, buf, n) +#elif defined(HAVE_SHA2_H) && defined(HAVE_SHA256INIT) +# include +# define ARCHIVE_HAS_SHA256 +typedef SHA256_CTX archive_sha256_ctx; +# define archive_sha256_init(ctx) SHA256Init(ctx) +# define archive_sha256_final(ctx, buf) SHA256Final(buf, ctx) +# define archive_sha256_update(ctx, buf, n) SHA256Update(ctx, buf, n) +#elif defined(HAVE_OPENSSL_SHA_H) && defined(HAVE_OPENSSL_SHA256_INIT) +# include +# define ARCHIVE_HAS_SHA256 +typedef SHA256_CTX archive_sha256_ctx; +# define archive_sha256_init(ctx) SHA256_Init(ctx) +# define archive_sha256_final(ctx, buf) SHA256_Final(buf, ctx) +# define archive_sha256_update(ctx, buf, n) SHA256_Update(ctx, buf, 
n) +#elif defined(_WIN32) && !defined(__CYGWIN__) && defined(CALG_SHA_256) +# define ARCHIVE_HAS_SHA256 +typedef SHA256_CTX archive_sha256_ctx; +# define archive_sha256_init(ctx) SHA256_Init(ctx) +# define archive_sha256_final(ctx, buf) SHA256_Final(buf, ctx) +# define archive_sha256_update(ctx, buf, n) SHA256_Update(ctx, buf, n) +#endif + +#if defined(HAVE_SHA2_H) && defined(HAVE_SHA384_INIT) +# include +# define ARCHIVE_HAS_SHA384 +typedef SHA384_CTX archive_sha384_ctx; +# define archive_sha384_init(ctx) SHA384_Init(ctx) +# define archive_sha384_final(ctx, buf) SHA384_Final(buf, ctx) +# define archive_sha384_update(ctx, buf, n) SHA384_Update(ctx, buf, n) +#elif defined(HAVE_SHA2_H) && defined(HAVE_SHA384INIT) +# include +# define ARCHIVE_HAS_SHA384 +typedef SHA384_CTX archive_sha384_ctx; +# define archive_sha384_init(ctx) SHA384Init(ctx) +# define archive_sha384_final(ctx, buf) SHA384Final(buf, ctx) +# define archive_sha384_update(ctx, buf, n) SHA384Update(ctx, buf, n) +#elif defined(HAVE_OPENSSL_SHA_H) && defined(HAVE_OPENSSL_SHA384_INIT) +# include +# define ARCHIVE_HAS_SHA384 +typedef SHA512_CTX archive_sha384_ctx; +# define archive_sha384_init(ctx) SHA384_Init(ctx) +# define archive_sha384_final(ctx, buf) SHA384_Final(buf, ctx) +# define archive_sha384_update(ctx, buf, n) SHA384_Update(ctx, buf, n) +#elif defined(_WIN32) && !defined(__CYGWIN__) && defined(CALG_SHA_384) +# define ARCHIVE_HAS_SHA384 +typedef SHA512_CTX archive_sha384_ctx; +# define archive_sha384_init(ctx) SHA384_Init(ctx) +# define archive_sha384_final(ctx, buf) SHA384_Final(buf, ctx) +# define archive_sha384_update(ctx, buf, n) SHA384_Update(ctx, buf, n) +#endif + +#if defined(HAVE_SHA2_H) && defined(HAVE_SHA512_INIT) +# include +# define ARCHIVE_HAS_SHA512 +typedef SHA512_CTX archive_sha512_ctx; +# define archive_sha512_init(ctx) SHA512_Init(ctx) +# define archive_sha512_final(ctx, buf) SHA512_Final(buf, ctx) +# define archive_sha512_update(ctx, buf, n) SHA512_Update(ctx, buf, n) +#elif 
defined(HAVE_SHA2_H) && defined(HAVE_SHA512INIT) +# include +# define ARCHIVE_HAS_SHA512 +typedef SHA512_CTX archive_sha512_ctx; +# define archive_sha512_init(ctx) SHA512Init(ctx) +# define archive_sha512_final(ctx, buf) SHA512Final(buf, ctx) +# define archive_sha512_update(ctx, buf, n) SHA512Update(ctx, buf, n) +#elif defined(HAVE_OPENSSL_SHA_H) && defined(HAVE_OPENSSL_SHA512_INIT) +# include +# define ARCHIVE_HAS_SHA512 +typedef SHA512_CTX archive_sha512_ctx; +# define archive_sha512_init(ctx) SHA512_Init(ctx) +# define archive_sha512_final(ctx, buf) SHA512_Final(buf, ctx) +# define archive_sha512_update(ctx, buf, n) SHA512_Update(ctx, buf, n) +#elif defined(_WIN32) && !defined(__CYGWIN__) && defined(CALG_SHA_512) +# define ARCHIVE_HAS_SHA512 +typedef SHA512_CTX archive_sha512_ctx; +# define archive_sha512_init(ctx) SHA512_Init(ctx) +# define archive_sha512_final(ctx, buf) SHA512_Final(buf, ctx) +# define archive_sha512_update(ctx, buf, n) SHA512_Update(ctx, buf, n) +#endif diff --git a/lib/libarchive/archive_platform.h b/lib/libarchive/archive_platform.h new file mode 100644 index 000000000..625296e16 --- /dev/null +++ b/lib/libarchive/archive_platform.h @@ -0,0 +1,167 @@ +/*- + * Copyright (c) 2003-2007 Tim Kientzle + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. 
+ * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR(S) ``AS IS'' AND ANY EXPRESS OR + * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES + * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. + * IN NO EVENT SHALL THE AUTHOR(S) BE LIABLE FOR ANY DIRECT, INDIRECT, + * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT + * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF + * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + * + * $FreeBSD: head/lib/libarchive/archive_platform.h 201090 2009-12-28 02:22:04Z kientzle $ + */ + +/* !!ONLY FOR USE INTERNALLY TO LIBARCHIVE!! */ + +/* + * This header is the first thing included in any of the libarchive + * source files. As far as possible, platform-specific issues should + * be dealt with here and not within individual source files. I'm + * actively trying to minimize #if blocks within the main source, + * since they obfuscate the code. + */ + +#ifndef ARCHIVE_PLATFORM_H_INCLUDED +#define ARCHIVE_PLATFORM_H_INCLUDED + +/* archive.h and archive_entry.h require this. */ +#define __LIBARCHIVE_BUILD 1 + +#if defined(PLATFORM_CONFIG_H) +/* Use hand-built config.h in environments that need it. */ +#include PLATFORM_CONFIG_H +#elif defined(HAVE_CONFIG_H) +/* Most POSIX platforms use the 'configure' script to build config.h */ +#include "config.h" +#else +/* Warn if the library hasn't been (automatically or manually) configured. */ +#error Oops: No config.h and no pre-built configuration in archive_platform.h. 
+#endif + +/* It should be possible to get rid of this by extending the feature-test + * macros to cover Windows API functions, probably along with non-trivial + * refactoring of code to find structures that sit more cleanly on top of + * either Windows or Posix APIs. */ +#if (defined(__WIN32__) || defined(_WIN32) || defined(__WIN32)) && !defined(__CYGWIN__) +#include "archive_windows.h" +#endif + +/* + * The config files define a lot of feature macros. The following + * uses those macros to select/define replacements and include key + * headers as required. + */ + +/* Get a real definition for __FBSDID if we can */ +#if HAVE_SYS_CDEFS_H +#include <sys/cdefs.h> +#endif + +/* If not, define it so as to avoid dangling semicolons. */ +#ifndef __FBSDID +#define __FBSDID(a) struct _undefined_hack +#endif + +/* Try to get standard C99-style integer type definitions. */ +#if HAVE_INTTYPES_H +#include <inttypes.h> +#endif +#if HAVE_STDINT_H +#include <stdint.h> +#endif + +/* Borland warns about its own constants! */ +#if defined(__BORLANDC__) +# if HAVE_DECL_UINT64_MAX +# undef UINT64_MAX +# undef HAVE_DECL_UINT64_MAX +# endif +# if HAVE_DECL_UINT64_MIN +# undef UINT64_MIN +# undef HAVE_DECL_UINT64_MIN +# endif +# if HAVE_DECL_INT64_MAX +# undef INT64_MAX +# undef HAVE_DECL_INT64_MAX +# endif +# if HAVE_DECL_INT64_MIN +# undef INT64_MIN +# undef HAVE_DECL_INT64_MIN +# endif +#endif + +/* Some platforms lack the standard *_MAX definitions.
*/ +#ifndef __minix +#if !HAVE_DECL_SIZE_MAX +#define SIZE_MAX (~(size_t)0) +#endif +#if !HAVE_DECL_SSIZE_MAX +#define SSIZE_MAX ((ssize_t)(SIZE_MAX >> 1)) +#endif +#if !HAVE_DECL_UINT32_MAX +#define UINT32_MAX (~(uint32_t)0) +#endif +#if !HAVE_DECL_UINT64_MAX +#define UINT64_MAX (~(uint64_t)0) +#endif +#if !HAVE_DECL_INT64_MAX +#define INT64_MAX ((int64_t)(UINT64_MAX >> 1)) +#endif +#if !HAVE_DECL_INT64_MIN +#define INT64_MIN ((int64_t)(~INT64_MAX)) +#endif +#endif + +/* + * If this platform has <sys/acl.h>, acl_create(), acl_init(), + * acl_set_file(), and ACL_USER, we assume it has the rest of the + * POSIX.1e draft functions used in archive_read_extract.c. + */ +#if HAVE_SYS_ACL_H && HAVE_ACL_CREATE_ENTRY && HAVE_ACL_INIT && HAVE_ACL_SET_FILE && HAVE_ACL_USER +#define HAVE_POSIX_ACL 1 +#endif + +/* + * If we can't restore metadata using a file descriptor, then + * for compatibility's sake, close files before trying to restore metadata. + */ +#if defined(HAVE_FCHMOD) || defined(HAVE_FUTIMES) || defined(HAVE_ACL_SET_FD) || defined(HAVE_ACL_SET_FD_NP) || defined(HAVE_FCHOWN) +#define CAN_RESTORE_METADATA_FD +#endif + +/* Set up defaults for internal error codes. */ +#ifndef ARCHIVE_ERRNO_FILE_FORMAT +#if HAVE_EFTYPE +#define ARCHIVE_ERRNO_FILE_FORMAT EFTYPE +#else +#if HAVE_EILSEQ +#define ARCHIVE_ERRNO_FILE_FORMAT EILSEQ +#else +#define ARCHIVE_ERRNO_FILE_FORMAT EINVAL +#endif +#endif +#endif + +#ifndef ARCHIVE_ERRNO_PROGRAMMER +#define ARCHIVE_ERRNO_PROGRAMMER EINVAL +#endif + +#ifndef ARCHIVE_ERRNO_MISC +#define ARCHIVE_ERRNO_MISC (-1) +#endif + +#endif /* !ARCHIVE_PLATFORM_H_INCLUDED */ diff --git a/lib/libarchive/archive_private.h b/lib/libarchive/archive_private.h new file mode 100644 index 000000000..88a32ea87 --- /dev/null +++ b/lib/libarchive/archive_private.h @@ -0,0 +1,130 @@ +/*- + * Copyright (c) 2003-2007 Tim Kientzle + * All rights reserved. 
+ * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR(S) ``AS IS'' AND ANY EXPRESS OR + * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES + * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. + * IN NO EVENT SHALL THE AUTHOR(S) BE LIABLE FOR ANY DIRECT, INDIRECT, + * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT + * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF + * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + * + * $FreeBSD: head/lib/libarchive/archive_private.h 201098 2009-12-28 02:58:14Z kientzle $ + */ + +#ifndef __LIBARCHIVE_BUILD +#error This header is only to be used internally to libarchive. 
+#endif + +#ifndef ARCHIVE_PRIVATE_H_INCLUDED +#define ARCHIVE_PRIVATE_H_INCLUDED + +#include "archive.h" +#include "archive_string.h" + +#if defined(__GNUC__) && (__GNUC__ > 2 || \ + (__GNUC__ == 2 && __GNUC_MINOR__ >= 5)) +#define __LA_DEAD __attribute__((__noreturn__)) +#else +#define __LA_DEAD +#endif + +#define ARCHIVE_WRITE_MAGIC (0xb0c5c0deU) +#define ARCHIVE_READ_MAGIC (0xdeb0c5U) +#define ARCHIVE_WRITE_DISK_MAGIC (0xc001b0c5U) +#define ARCHIVE_READ_DISK_MAGIC (0xbadb0c5U) + +#define ARCHIVE_STATE_ANY 0xFFFFU +#define ARCHIVE_STATE_NEW 1U +#define ARCHIVE_STATE_HEADER 2U +#define ARCHIVE_STATE_DATA 4U +#define ARCHIVE_STATE_DATA_END 8U +#define ARCHIVE_STATE_EOF 0x10U +#define ARCHIVE_STATE_CLOSED 0x20U +#define ARCHIVE_STATE_FATAL 0x8000U + +struct archive_vtable { + int (*archive_close)(struct archive *); + int (*archive_finish)(struct archive *); + int (*archive_write_header)(struct archive *, + struct archive_entry *); + int (*archive_write_finish_entry)(struct archive *); + ssize_t (*archive_write_data)(struct archive *, + const void *, size_t); + ssize_t (*archive_write_data_block)(struct archive *, + const void *, size_t, off_t); +}; + +struct archive { + /* + * The magic/state values are used to sanity-check the + * client's usage. If an API function is called at a + * ridiculous time, or the client passes us an invalid + * pointer, these values allow me to catch that. + */ + unsigned int magic; + unsigned int state; + + /* + * Some public API functions depend on the "real" type of the + * archive object. + */ + struct archive_vtable *vtable; + + int archive_format; + const char *archive_format_name; + + int compression_code; /* Currently active compression. */ + const char *compression_name; + +#ifndef __minix + /* Position in UNCOMPRESSED data stream. */ + int64_t file_position; + /* Position in COMPRESSED data stream. */ + int64_t raw_position; +#else + /* Position in UNCOMPRESSED data stream. 
*/ + off_t file_position; + /* Position in COMPRESSED data stream. */ + off_t raw_position; +#endif + int file_count; + /* Number of file entries processed. */ + int archive_error_number; + const char *error; + struct archive_string error_string; +}; + +/* Check magic value and state; exit if it isn't valid. */ +void __archive_check_magic(struct archive *, unsigned int magic, + unsigned int state, const char *func); + +void __archive_errx(int retvalue, const char *msg) __LA_DEAD; + +int __archive_parse_options(const char *p, const char *fn, + int keysize, char *key, int valsize, char *val); + +#define err_combine(a,b) ((a) < (b) ? (a) : (b)) + +#if defined(__BORLANDC__) || (defined(_MSC_VER) && _MSC_VER <= 1300) +# define ARCHIVE_LITERAL_LL(x) x##i64 +# define ARCHIVE_LITERAL_ULL(x) x##ui64 +#else +# define ARCHIVE_LITERAL_LL(x) x##ll +# define ARCHIVE_LITERAL_ULL(x) x##ull +#endif + +#endif diff --git a/lib/libarchive/archive_read.3 b/lib/libarchive/archive_read.3 new file mode 100644 index 000000000..43f3c7632 --- /dev/null +++ b/lib/libarchive/archive_read.3 @@ -0,0 +1,714 @@ +.\" Copyright (c) 2003-2007 Tim Kientzle +.\" All rights reserved. +.\" +.\" Redistribution and use in source and binary forms, with or without +.\" modification, are permitted provided that the following conditions +.\" are met: +.\" 1. Redistributions of source code must retain the above copyright +.\" notice, this list of conditions and the following disclaimer. +.\" 2. Redistributions in binary form must reproduce the above copyright +.\" notice, this list of conditions and the following disclaimer in the +.\" documentation and/or other materials provided with the distribution. +.\" +.\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND +.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +.\" ARE DISCLAIMED. 
IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE +.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS +.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) +.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT +.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY +.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF +.\" SUCH DAMAGE. +.\" +.\" $FreeBSD: head/lib/libarchive/archive_read.3 191595 2009-04-27 20:13:13Z kientzle $ +.\" +.Dd April 13, 2009 +.Dt archive_read 3 +.Os +.Sh NAME +.Nm archive_read_new , +.Nm archive_read_set_filter_options , +.Nm archive_read_set_format_options , +.Nm archive_read_set_options , +.Nm archive_read_support_compression_all , +.Nm archive_read_support_compression_bzip2 , +.Nm archive_read_support_compression_compress , +.Nm archive_read_support_compression_gzip , +.Nm archive_read_support_compression_lzma , +.Nm archive_read_support_compression_none , +.Nm archive_read_support_compression_xz , +.Nm archive_read_support_compression_program , +.Nm archive_read_support_compression_program_signature , +.Nm archive_read_support_format_all , +.Nm archive_read_support_format_ar , +.Nm archive_read_support_format_cpio , +.Nm archive_read_support_format_empty , +.Nm archive_read_support_format_iso9660 , +.Nm archive_read_support_format_mtree, +.Nm archive_read_support_format_raw, +.Nm archive_read_support_format_tar , +.Nm archive_read_support_format_zip , +.Nm archive_read_open , +.Nm archive_read_open2 , +.Nm archive_read_open_fd , +.Nm archive_read_open_FILE , +.Nm archive_read_open_filename , +.Nm archive_read_open_memory , +.Nm archive_read_next_header , +.Nm archive_read_next_header2 , +.Nm archive_read_data , +.Nm archive_read_data_block , +.Nm archive_read_data_skip , +.\" #if ARCHIVE_API_VERSION < 3 +.Nm 
archive_read_data_into_buffer , +.\" #endif +.Nm archive_read_data_into_fd , +.Nm archive_read_extract , +.Nm archive_read_extract2 , +.Nm archive_read_extract_set_progress_callback , +.Nm archive_read_close , +.Nm archive_read_finish +.Nd functions for reading streaming archives +.Sh SYNOPSIS +.In archive.h +.Ft struct archive * +.Fn archive_read_new "void" +.Ft int +.Fn archive_read_support_compression_all "struct archive *" +.Ft int +.Fn archive_read_support_compression_bzip2 "struct archive *" +.Ft int +.Fn archive_read_support_compression_compress "struct archive *" +.Ft int +.Fn archive_read_support_compression_gzip "struct archive *" +.Ft int +.Fn archive_read_support_compression_lzma "struct archive *" +.Ft int +.Fn archive_read_support_compression_none "struct archive *" +.Ft int +.Fn archive_read_support_compression_xz "struct archive *" +.Ft int +.Fo archive_read_support_compression_program +.Fa "struct archive *" +.Fa "const char *cmd" +.Fc +.Ft int +.Fo archive_read_support_compression_program_signature +.Fa "struct archive *" +.Fa "const char *cmd" +.Fa "const void *signature" +.Fa "size_t signature_length" +.Fc +.Ft int +.Fn archive_read_support_format_all "struct archive *" +.Ft int +.Fn archive_read_support_format_ar "struct archive *" +.Ft int +.Fn archive_read_support_format_cpio "struct archive *" +.Ft int +.Fn archive_read_support_format_empty "struct archive *" +.Ft int +.Fn archive_read_support_format_iso9660 "struct archive *" +.Ft int +.Fn archive_read_support_format_mtree "struct archive *" +.Ft int +.Fn archive_read_support_format_raw "struct archive *" +.Ft int +.Fn archive_read_support_format_tar "struct archive *" +.Ft int +.Fn archive_read_support_format_zip "struct archive *" +.Ft int +.Fn archive_read_set_filter_options "struct archive *" "const char *" +.Ft int +.Fn archive_read_set_format_options "struct archive *" "const char *" +.Ft int +.Fn archive_read_set_options "struct archive *" "const char *" +.Ft int +.Fo 
archive_read_open +.Fa "struct archive *" +.Fa "void *client_data" +.Fa "archive_open_callback *" +.Fa "archive_read_callback *" +.Fa "archive_close_callback *" +.Fc +.Ft int +.Fo archive_read_open2 +.Fa "struct archive *" +.Fa "void *client_data" +.Fa "archive_open_callback *" +.Fa "archive_read_callback *" +.Fa "archive_skip_callback *" +.Fa "archive_close_callback *" +.Fc +.Ft int +.Fn archive_read_open_FILE "struct archive *" "FILE *file" +.Ft int +.Fn archive_read_open_fd "struct archive *" "int fd" "size_t block_size" +.Ft int +.Fo archive_read_open_filename +.Fa "struct archive *" +.Fa "const char *filename" +.Fa "size_t block_size" +.Fc +.Ft int +.Fn archive_read_open_memory "struct archive *" "void *buff" "size_t size" +.Ft int +.Fn archive_read_next_header "struct archive *" "struct archive_entry **" +.Ft int +.Fn archive_read_next_header2 "struct archive *" "struct archive_entry *" +.Ft ssize_t +.Fn archive_read_data "struct archive *" "void *buff" "size_t len" +.Ft int +.Fo archive_read_data_block +.Fa "struct archive *" +.Fa "const void **buff" +.Fa "size_t *len" +.Fa "off_t *offset" +.Fc +.Ft int +.Fn archive_read_data_skip "struct archive *" +.\" #if ARCHIVE_API_VERSION < 3 +.Ft int +.Fn archive_read_data_into_buffer "struct archive *" "void *" "ssize_t len" +.\" #endif +.Ft int +.Fn archive_read_data_into_fd "struct archive *" "int fd" +.Ft int +.Fo archive_read_extract +.Fa "struct archive *" +.Fa "struct archive_entry *" +.Fa "int flags" +.Fc +.Ft int +.Fo archive_read_extract2 +.Fa "struct archive *src" +.Fa "struct archive_entry *" +.Fa "struct archive *dest" +.Fc +.Ft void +.Fo archive_read_extract_set_progress_callback +.Fa "struct archive *" +.Fa "void (*func)(void *)" +.Fa "void *user_data" +.Fc +.Ft int +.Fn archive_read_close "struct archive *" +.Ft int +.Fn archive_read_finish "struct archive *" +.Sh DESCRIPTION +These functions provide a complete API for reading streaming archives. 
+The general process is to first create the +.Tn struct archive +object, set options, initialize the reader, iterate over the archive +headers and associated data, then close the archive and release all +resources. +The following summary describes the functions in approximately the +order they would be used: +.Bl -tag -compact -width indent +.It Fn archive_read_new +Allocates and initializes a +.Tn struct archive +object suitable for reading from an archive. +.It Xo +.Fn archive_read_support_compression_bzip2 , +.Fn archive_read_support_compression_compress , +.Fn archive_read_support_compression_gzip , +.Fn archive_read_support_compression_lzma , +.Fn archive_read_support_compression_none , +.Fn archive_read_support_compression_xz +.Xc +Enables auto-detection code and decompression support for the +specified compression. +Returns +.Cm ARCHIVE_OK +if the compression is fully supported, or +.Cm ARCHIVE_WARN +if the compression is supported only through an external program. +Note that decompression using an external program is usually slower than +decompression through built-in libraries. +Note that +.Dq none +is always enabled by default. +.It Fn archive_read_support_compression_all +Enables all available decompression filters. +.It Fn archive_read_support_compression_program +Data is fed through the specified external program before being dearchived. +Note that this disables automatic detection of the compression format, +so it makes no sense to specify this in conjunction with any other +decompression option. +.It Fn archive_read_support_compression_program_signature +This feeds data through the specified external program +but only if the initial bytes of the data match the specified +signature value. 
+.It Xo +.Fn archive_read_support_format_all , +.Fn archive_read_support_format_ar , +.Fn archive_read_support_format_cpio , +.Fn archive_read_support_format_empty , +.Fn archive_read_support_format_iso9660 , +.Fn archive_read_support_format_mtree , +.Fn archive_read_support_format_tar , +.Fn archive_read_support_format_zip +.Xc +Enables support---including auto-detection code---for the +specified archive format. +For example, +.Fn archive_read_support_format_tar +enables support for a variety of standard tar formats, old-style tar, +ustar, pax interchange format, and many common variants. +For convenience, +.Fn archive_read_support_format_all +enables support for all available formats. +Only empty archives are supported by default. +.It Fn archive_read_support_format_raw +The +.Dq raw +format handler allows libarchive to be used to read arbitrary data. +It treats any data stream as an archive with a single entry. +The pathname of this entry is +.Dq data ; +all other entry fields are unset. +This is not enabled by +.Fn archive_read_support_format_all +in order to avoid erroneous handling of damaged archives. +.It Xo +.Fn archive_read_set_filter_options , +.Fn archive_read_set_format_options , +.Fn archive_read_set_options +.Xc +Specifies options that will be passed to currently-registered +filters (including decompression filters) and/or format readers. +The argument is a comma-separated list of individual options. +Individual options have one of the following forms: +.Bl -tag -compact -width indent +.It Ar option=value +The option/value pair will be provided to every module. +Modules that do not accept an option with this name will ignore it. +.It Ar option +The option will be provided to every module with a value of +.Dq 1 . +.It Ar !option +The option will be provided to every module with a NULL value. 
+.It Ar module:option=value , Ar module:option , Ar module:!option +As above, but the corresponding option and value will be provided +only to modules whose name matches +.Ar module . +.El +The return value will be +.Cm ARCHIVE_OK +if any module accepts the option, or +.Cm ARCHIVE_WARN +if no module accepted the option, or +.Cm ARCHIVE_FATAL +if there was a fatal error while attempting to process the option. +.Pp +The currently supported options are: +.Bl -tag -compact -width indent +.It Format iso9660 +.Bl -tag -compact -width indent +.It Cm joliet +Support Joliet extensions. +Defaults to enabled, use +.Cm !joliet +to disable. +.El +.El +.It Fn archive_read_open +The same as +.Fn archive_read_open2 , +except that the skip callback is assumed to be +.Dv NULL . +.It Fn archive_read_open2 +Freeze the settings, open the archive, and prepare for reading entries. +This is the most generic version of this call, which accepts +four callback functions. +Most clients will want to use +.Fn archive_read_open_filename , +.Fn archive_read_open_FILE , +.Fn archive_read_open_fd , +or +.Fn archive_read_open_memory +instead. +The library invokes the client-provided functions to obtain +raw bytes from the archive. +.It Fn archive_read_open_FILE +Like +.Fn archive_read_open , +except that it accepts a +.Ft "FILE *" +pointer. +This function should not be used with tape drives or other devices +that require strict I/O blocking. +.It Fn archive_read_open_fd +Like +.Fn archive_read_open , +except that it accepts a file descriptor and block size rather than +a set of function pointers. +Note that the file descriptor will not be automatically closed at +end-of-archive. +This function is safe for use with tape drives or other blocked devices. +.It Fn archive_read_open_file +This is a deprecated synonym for +.Fn archive_read_open_filename . +.It Fn archive_read_open_filename +Like +.Fn archive_read_open , +except that it accepts a simple filename and a block size. 
+A NULL filename represents standard input. +This function is safe for use with tape drives or other blocked devices. +.It Fn archive_read_open_memory +Like +.Fn archive_read_open , +except that it accepts a pointer and size of a block of +memory containing the archive data. +.It Fn archive_read_next_header +Read the header for the next entry and return a pointer to +a +.Tn struct archive_entry . +This is a convenience wrapper around +.Fn archive_read_next_header2 +that reuses an internal +.Tn struct archive_entry +object for each request. +.It Fn archive_read_next_header2 +Read the header for the next entry and populate the provided +.Tn struct archive_entry . +.It Fn archive_read_data +Read data associated with the header just read. +Internally, this is a convenience function that calls +.Fn archive_read_data_block +and fills any gaps with nulls so that callers see a single +continuous stream of data. +.It Fn archive_read_data_block +Return the next available block of data for this entry. +Unlike +.Fn archive_read_data , +the +.Fn archive_read_data_block +function avoids copying data and allows you to correctly handle +sparse files, as supported by some archive formats. +The library guarantees that offsets will increase and that blocks +will not overlap. +Note that the blocks returned from this function can be much larger +than the block size read from disk, due to compression +and internal buffer optimizations. +.It Fn archive_read_data_skip +A convenience function that repeatedly calls +.Fn archive_read_data_block +to skip all of the data for this archive entry. +.\" #if ARCHIVE_API_VERSION < 3 +.It Fn archive_read_data_into_buffer +This function is deprecated and will be removed. +Use +.Fn archive_read_data +instead. +.\" #endif +.It Fn archive_read_data_into_fd +A convenience function that repeatedly calls +.Fn archive_read_data_block +to copy the entire entry to the provided file descriptor. 
+.It Fn archive_read_extract , Fn archive_read_extract_set_skip_file +A convenience function that wraps the corresponding +.Xr archive_write_disk 3 +interfaces. +The first call to +.Fn archive_read_extract +creates a restore object using +.Xr archive_write_disk_new 3 +and +.Xr archive_write_disk_set_standard_lookup 3 , +then transparently invokes +.Xr archive_write_disk_set_options 3 , +.Xr archive_write_header 3 , +.Xr archive_write_data 3 , +and +.Xr archive_write_finish_entry 3 +to create the entry on disk and copy data into it. +The +.Va flags +argument is passed unmodified to +.Xr archive_write_disk_set_options 3 . +.It Fn archive_read_extract2 +This is another version of +.Fn archive_read_extract +that allows you to provide your own restore object. +In particular, this allows you to override the standard lookup functions +using +.Xr archive_write_disk_set_group_lookup 3 , +and +.Xr archive_write_disk_set_user_lookup 3 . +Note that +.Fn archive_read_extract2 +does not accept a +.Va flags +argument; you should use +.Fn archive_write_disk_set_options +to set the restore options yourself. +.It Fn archive_read_extract_set_progress_callback +Sets a pointer to a user-defined callback that can be used +for updating progress displays during extraction. +The progress function will be invoked during the extraction of large +regular files. +The progress function will be invoked with the pointer provided to this call. +Generally, the data pointed to should include a reference to the archive +object and the archive_entry object so that various statistics +can be retrieved for the progress display. +.It Fn archive_read_close +Complete the archive and invoke the close callback. +.It Fn archive_read_finish +Invokes +.Fn archive_read_close +if it was not invoked manually, then release all resources. 
+Note: In libarchive 1.x, this function was declared to return +.Ft void , +which made it impossible to detect certain errors when +.Fn archive_read_close +was invoked implicitly from this function. +The declaration is corrected beginning with libarchive 2.0. +.El +.Pp +Note that the library determines most of the relevant information about +the archive by inspection. +In particular, it automatically detects +.Xr gzip 1 +or +.Xr bzip2 1 +compression and transparently performs the appropriate decompression. +It also automatically detects the archive format. +.Pp +A complete description of the +.Tn struct archive +and +.Tn struct archive_entry +objects can be found in the overview manual page for +.Xr libarchive 3 . +.Sh CLIENT CALLBACKS +The callback functions must match the following prototypes: +.Bl -item -offset indent +.It +.Ft typedef ssize_t +.Fo archive_read_callback +.Fa "struct archive *" +.Fa "void *client_data" +.Fa "const void **buffer" +.Fc +.It +.\" #if ARCHIVE_API_VERSION < 2 +.Ft typedef int +.Fo archive_skip_callback +.Fa "struct archive *" +.Fa "void *client_data" +.Fa "size_t request" +.Fc +.\" #else +.\" .Ft typedef off_t +.\" .Fo archive_skip_callback +.\" .Fa "struct archive *" +.\" .Fa "void *client_data" +.\" .Fa "off_t request" +.\" .Fc +.\" #endif +.It +.Ft typedef int +.Fn archive_open_callback "struct archive *" "void *client_data" +.It +.Ft typedef int +.Fn archive_close_callback "struct archive *" "void *client_data" +.El +.Pp +The open callback is invoked by +.Fn archive_open . +It should return +.Cm ARCHIVE_OK +if the underlying file or data source is successfully +opened. +If the open fails, it should call +.Fn archive_set_error +to register an error code and message and return +.Cm ARCHIVE_FATAL . +.Pp +The read callback is invoked whenever the library +requires raw bytes from the archive. 
+The read callback should read data into a buffer, +set the +.Li const void **buffer +argument to point to the available data, and +return a count of the number of bytes available. +The library will invoke the read callback again +only after it has consumed this data. +The library imposes no constraints on the size +of the data blocks returned. +On end-of-file, the read callback should +return zero. +On error, the read callback should invoke +.Fn archive_set_error +to register an error code and message and +return -1. +.Pp +The skip callback is invoked when the +library wants to ignore a block of data. +The return value is the number of bytes actually +skipped, which may differ from the request. +If the callback cannot skip data, it should return +zero. +If the skip callback is not provided (the +function pointer is +.Dv NULL ), +the library will invoke the read function +instead and simply discard the result. +A skip callback can provide significant +performance gains when reading uncompressed +archives from slow disk drives or other media +that can skip quickly. +.Pp +The close callback is invoked by archive_close when +the archive processing is complete. +The callback should return +.Cm ARCHIVE_OK +on success. +On failure, the callback should invoke +.Fn archive_set_error +to register an error code and message and +return +.Cm ARCHIVE_FATAL. +.Sh EXAMPLE +The following illustrates basic usage of the library. +In this example, +the callback functions are simply wrappers around the standard +.Xr open 2 , +.Xr read 2 , +and +.Xr close 2 +system calls. 
+.Bd -literal -offset indent +void +list_archive(const char *name) +{ + struct mydata *mydata; + struct archive *a; + struct archive_entry *entry; + + mydata = malloc(sizeof(struct mydata)); + a = archive_read_new(); + mydata->name = name; + archive_read_support_compression_all(a); + archive_read_support_format_all(a); + archive_read_open(a, mydata, myopen, myread, myclose); + while (archive_read_next_header(a, &entry) == ARCHIVE_OK) { + printf("%s\\n",archive_entry_pathname(entry)); + archive_read_data_skip(a); + } + archive_read_finish(a); + free(mydata); +} + +ssize_t +myread(struct archive *a, void *client_data, const void **buff) +{ + struct mydata *mydata = client_data; + + *buff = mydata->buff; + return (read(mydata->fd, mydata->buff, 10240)); +} + +int +myopen(struct archive *a, void *client_data) +{ + struct mydata *mydata = client_data; + + mydata->fd = open(mydata->name, O_RDONLY); + return (mydata->fd >= 0 ? ARCHIVE_OK : ARCHIVE_FATAL); +} + +int +myclose(struct archive *a, void *client_data) +{ + struct mydata *mydata = client_data; + + if (mydata->fd > 0) + close(mydata->fd); + return (ARCHIVE_OK); +} +.Ed +.Sh RETURN VALUES +Most functions return zero on success, non-zero on error. +The possible return codes include: +.Cm ARCHIVE_OK +(the operation succeeded), +.Cm ARCHIVE_WARN +(the operation succeeded but a non-critical error was encountered), +.Cm ARCHIVE_EOF +(end-of-archive was encountered), +.Cm ARCHIVE_RETRY +(the operation failed but can be retried), +and +.Cm ARCHIVE_FATAL +(there was a fatal error; the archive should be closed immediately). +Detailed error codes and textual descriptions are available from the +.Fn archive_errno +and +.Fn archive_error_string +functions. +.Pp +.Fn archive_read_new +returns a pointer to a freshly allocated +.Tn struct archive +object. +It returns +.Dv NULL +on error. +.Pp +.Fn archive_read_data +returns a count of bytes actually read or zero at the end of the entry. 
+On error, a value of +.Cm ARCHIVE_FATAL , +.Cm ARCHIVE_WARN , +or +.Cm ARCHIVE_RETRY +is returned and an error code and textual description can be retrieved from the +.Fn archive_errno +and +.Fn archive_error_string +functions. +.Pp +The library expects the client callbacks to behave similarly. +If there is an error, you can use +.Fn archive_set_error +to set an appropriate error code and description, +then return one of the non-zero values above. +(Note that the value eventually returned to the client may +not be the same; many errors that are not critical at the level +of basic I/O can prevent the archive from being properly read, +thus most I/O errors eventually cause +.Cm ARCHIVE_FATAL +to be returned.) +.\" .Sh ERRORS +.Sh SEE ALSO +.Xr tar 1 , +.Xr archive 3 , +.Xr archive_util 3 , +.Xr tar 5 +.Sh HISTORY +The +.Nm libarchive +library first appeared in +.Fx 5.3 . +.Sh AUTHORS +.An -nosplit +The +.Nm libarchive +library was written by +.An Tim Kientzle Aq kientzle@acm.org . +.Sh BUGS +Many traditional archiver programs treat +empty files as valid empty archives. +For example, many implementations of +.Xr tar 1 +allow you to append entries to an empty file. +Of course, it is impossible to determine the format of an empty file +by inspecting the contents, so this library treats empty files as +having a special +.Dq empty +format. diff --git a/lib/libarchive/archive_read.c b/lib/libarchive/archive_read.c new file mode 100644 index 000000000..4c03bef29 --- /dev/null +++ b/lib/libarchive/archive_read.c @@ -0,0 +1,1385 @@ +/*- + * Copyright (c) 2003-2007 Tim Kientzle + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. 
Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR(S) ``AS IS'' AND ANY EXPRESS OR + * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES + * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. + * IN NO EVENT SHALL THE AUTHOR(S) BE LIABLE FOR ANY DIRECT, INDIRECT, + * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT + * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF + * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + */ + +/* + * This file contains the "essential" portions of the read API, that + * is, stuff that will probably always be used by any client that + * actually needs to read an archive. Optional pieces have been, as + * far as possible, separated out into separate files to avoid + * needlessly bloating statically-linked clients. + */ + +#include "archive_platform.h" +__FBSDID("$FreeBSD: head/lib/libarchive/archive_read.c 201157 2009-12-29 05:30:23Z kientzle $"); + +#ifdef HAVE_ERRNO_H +#include <errno.h> +#endif +#include <stdio.h> +#ifdef HAVE_STDLIB_H +#include <stdlib.h> +#endif +#ifdef HAVE_STRING_H +#include <string.h> +#endif +#ifdef HAVE_UNISTD_H +#include <unistd.h> +#endif + +#include "archive.h" +#include "archive_entry.h" +#include "archive_private.h" +#include "archive_read_private.h" + +#define minimum(a, b) (a < b ? 
a : b) + +static int build_stream(struct archive_read *); +static int choose_format(struct archive_read *); +static int cleanup_filters(struct archive_read *); +static struct archive_vtable *archive_read_vtable(void); +static int _archive_read_close(struct archive *); +static int _archive_read_finish(struct archive *); + +static struct archive_vtable * +archive_read_vtable(void) +{ + static struct archive_vtable av; + static int inited = 0; + + if (!inited) { + av.archive_finish = _archive_read_finish; + av.archive_close = _archive_read_close; + } + return (&av); +} + +/* + * Allocate, initialize and return a struct archive object. + */ +struct archive * +archive_read_new(void) +{ + struct archive_read *a; + + a = (struct archive_read *)malloc(sizeof(*a)); + if (a == NULL) + return (NULL); + memset(a, 0, sizeof(*a)); + a->archive.magic = ARCHIVE_READ_MAGIC; + + a->archive.state = ARCHIVE_STATE_NEW; + a->entry = archive_entry_new(); + a->archive.vtable = archive_read_vtable(); + + return (&a->archive); +} + +/* + * Record the do-not-extract-to file. This belongs in archive_read_extract.c. + */ +void +archive_read_extract_set_skip_file(struct archive *_a, dev_t d, ino_t i) +{ + struct archive_read *a = (struct archive_read *)_a; + __archive_check_magic(_a, ARCHIVE_READ_MAGIC, ARCHIVE_STATE_ANY, + "archive_read_extract_set_skip_file"); + a->skip_file_dev = d; + a->skip_file_ino = i; +} + +/* + * Set read options for the format. 
+ */ +int +archive_read_set_format_options(struct archive *_a, const char *s) +{ + struct archive_read *a; + struct archive_format_descriptor *format; + char key[64], val[64]; + char *valp; + size_t i; + int len, r; + + __archive_check_magic(_a, ARCHIVE_READ_MAGIC, ARCHIVE_STATE_NEW, + "archive_read_set_format_options"); + + if (s == NULL || *s == '\0') + return (ARCHIVE_OK); + a = (struct archive_read *)_a; + __archive_check_magic(&a->archive, ARCHIVE_READ_MAGIC, + ARCHIVE_STATE_NEW, "archive_read_set_format_options"); + len = 0; + for (i = 0; i < sizeof(a->formats)/sizeof(a->formats[0]); i++) { + format = &a->formats[i]; + if (format == NULL || format->options == NULL || + format->name == NULL) + /* This format does not support option. */ + continue; + + while ((len = __archive_parse_options(s, format->name, + sizeof(key), key, sizeof(val), val)) > 0) { + valp = val[0] == '\0' ? NULL : val; + a->format = format; + r = format->options(a, key, valp); + a->format = NULL; + if (r == ARCHIVE_FATAL) + return (r); + s += len; + } + } + if (len < 0) { + archive_set_error(&a->archive, ARCHIVE_ERRNO_MISC, + "Illegal format options."); + return (ARCHIVE_WARN); + } + return (ARCHIVE_OK); +} + +/* + * Set read options for the filter. 
+ */ +int +archive_read_set_filter_options(struct archive *_a, const char *s) +{ + struct archive_read *a; + struct archive_read_filter *filter; + struct archive_read_filter_bidder *bidder; + char key[64], val[64]; + int len, r; + + __archive_check_magic(_a, ARCHIVE_READ_MAGIC, ARCHIVE_STATE_NEW, + "archive_read_set_filter_options"); + + if (s == NULL || *s == '\0') + return (ARCHIVE_OK); + a = (struct archive_read *)_a; + __archive_check_magic(&a->archive, ARCHIVE_READ_MAGIC, + ARCHIVE_STATE_NEW, "archive_read_set_filter_options"); + len = 0; + for (filter = a->filter; filter != NULL; filter = filter->upstream) { + bidder = filter->bidder; + if (bidder == NULL) + continue; + if (bidder->options == NULL) + /* This bidder does not support option */ + continue; + while ((len = __archive_parse_options(s, filter->name, + sizeof(key), key, sizeof(val), val)) > 0) { + if (val[0] == '\0') + r = bidder->options(bidder, key, NULL); + else + r = bidder->options(bidder, key, val); + if (r == ARCHIVE_FATAL) + return (r); + s += len; + } + } + if (len < 0) { + archive_set_error(&a->archive, ARCHIVE_ERRNO_MISC, + "Illegal format options."); + return (ARCHIVE_WARN); + } + return (ARCHIVE_OK); +} + +/* + * Set read options for the format and the filter. + */ +int +archive_read_set_options(struct archive *_a, const char *s) +{ + int r; + + __archive_check_magic(_a, ARCHIVE_READ_MAGIC, ARCHIVE_STATE_NEW, + "archive_read_set_options"); + archive_clear_error(_a); + + r = archive_read_set_format_options(_a, s); + if (r != ARCHIVE_OK) + return (r); + r = archive_read_set_filter_options(_a, s); + if (r != ARCHIVE_OK) + return (r); + return (ARCHIVE_OK); +} + +/* + * Open the archive + */ +int +archive_read_open(struct archive *a, void *client_data, + archive_open_callback *client_opener, archive_read_callback *client_reader, + archive_close_callback *client_closer) +{ + /* Old archive_read_open() is just a thin shell around + * archive_read_open2. 
*/ + return archive_read_open2(a, client_data, client_opener, + client_reader, NULL, client_closer); +} + +static ssize_t +client_read_proxy(struct archive_read_filter *self, const void **buff) +{ + ssize_t r; + r = (self->archive->client.reader)(&self->archive->archive, + self->data, buff); + self->archive->archive.raw_position += r; + return (r); +} + +#ifndef __minix +static int64_t +client_skip_proxy(struct archive_read_filter *self, int64_t request) +{ + int64_t ask, get, total; + /* Limit our maximum seek request to 1GB on platforms + * with 32-bit off_t (such as Windows). */ + int64_t skip_limit = ((int64_t)1) << (sizeof(off_t) * 8 - 2); + + if (self->archive->client.skipper == NULL) + return (0); + total = 0; + for (;;) { + ask = request; + if (ask > skip_limit) + ask = skip_limit; + get = (self->archive->client.skipper)(&self->archive->archive, + self->data, ask); + if (get == 0) + return (total); + request -= get; + self->archive->archive.raw_position += get; + total += get; + } +} +#else +static ssize_t +client_skip_proxy(struct archive_read_filter *self, ssize_t request) +{ + size_t ask, get, total; + /* Limit our maximum seek request to 1GB on platforms + * with 32-bit off_t (such as Windows). 
*/ + size_t skip_limit = ((size_t)1) << (sizeof(off_t) * 8 - 2); + + if (self->archive->client.skipper == NULL) + return (0); + total = 0; + for (;;) { + ask = request; + if (ask > skip_limit) + ask = skip_limit; + get = (self->archive->client.skipper)(&self->archive->archive, + self->data, ask); + if (get == 0) + return (total); + request -= get; + self->archive->archive.raw_position += get; + total += get; + } +} +#endif + +static int +client_close_proxy(struct archive_read_filter *self) +{ + int r = ARCHIVE_OK; + + if (self->archive->client.closer != NULL) + r = (self->archive->client.closer)((struct archive *)self->archive, + self->data); + self->data = NULL; + return (r); +} + + +int +archive_read_open2(struct archive *_a, void *client_data, + archive_open_callback *client_opener, + archive_read_callback *client_reader, + archive_skip_callback *client_skipper, + archive_close_callback *client_closer) +{ + struct archive_read *a = (struct archive_read *)_a; + struct archive_read_filter *filter; + int e; + + __archive_check_magic(_a, ARCHIVE_READ_MAGIC, ARCHIVE_STATE_NEW, + "archive_read_open"); + archive_clear_error(&a->archive); + + if (client_reader == NULL) + __archive_errx(1, + "No reader function provided to archive_read_open"); + + /* Open data source. */ + if (client_opener != NULL) { + e =(client_opener)(&a->archive, client_data); + if (e != 0) { + /* If the open failed, call the closer to clean up. */ + if (client_closer) + (client_closer)(&a->archive, client_data); + return (e); + } + } + + /* Save the client functions and mock up the initial source. 
*/ + a->client.reader = client_reader; + a->client.skipper = client_skipper; + a->client.closer = client_closer; + + filter = calloc(1, sizeof(*filter)); + if (filter == NULL) + return (ARCHIVE_FATAL); + filter->bidder = NULL; + filter->upstream = NULL; + filter->archive = a; + filter->data = client_data; + filter->read = client_read_proxy; + filter->skip = client_skip_proxy; + filter->close = client_close_proxy; + filter->name = "none"; + filter->code = ARCHIVE_COMPRESSION_NONE; + a->filter = filter; + + /* Build out the input pipeline. */ + e = build_stream(a); + if (e == ARCHIVE_OK) + a->archive.state = ARCHIVE_STATE_HEADER; + + return (e); +} + +/* + * Allow each registered stream transform to bid on whether + * it wants to handle this stream. Repeat until we've finished + * building the pipeline. + */ +static int +build_stream(struct archive_read *a) +{ + int number_bidders, i, bid, best_bid; + struct archive_read_filter_bidder *bidder, *best_bidder; + struct archive_read_filter *filter; + ssize_t avail; + int r; + + for (;;) { + number_bidders = sizeof(a->bidders) / sizeof(a->bidders[0]); + + best_bid = 0; + best_bidder = NULL; + + bidder = a->bidders; + for (i = 0; i < number_bidders; i++, bidder++) { + if (bidder->bid != NULL) { + bid = (bidder->bid)(bidder, a->filter); + if (bid > best_bid) { + best_bid = bid; + best_bidder = bidder; + } + } + } + + /* If no bidder, we're done. */ + if (best_bidder == NULL) { + a->archive.compression_name = a->filter->name; + a->archive.compression_code = a->filter->code; + return (ARCHIVE_OK); + } + + filter + = (struct archive_read_filter *)calloc(1, sizeof(*filter)); + if (filter == NULL) + return (ARCHIVE_FATAL); + filter->bidder = best_bidder; + filter->archive = a; + filter->upstream = a->filter; + r = (best_bidder->init)(filter); + if (r != ARCHIVE_OK) { + free(filter); + return (r); + } + a->filter = filter; + /* Verify the filter by asking it for some data. 
*/ + __archive_read_filter_ahead(filter, 1, &avail); + if (avail < 0) { + cleanup_filters(a); + return (ARCHIVE_FATAL); + } + } +} + +/* + * Read header of next entry. + */ +int +archive_read_next_header2(struct archive *_a, struct archive_entry *entry) +{ + struct archive_read *a = (struct archive_read *)_a; + int slot, ret; + + __archive_check_magic(_a, ARCHIVE_READ_MAGIC, + ARCHIVE_STATE_HEADER | ARCHIVE_STATE_DATA, + "archive_read_next_header"); + + ++_a->file_count; + archive_entry_clear(entry); + archive_clear_error(&a->archive); + + /* + * If no format has yet been chosen, choose one. + */ + if (a->format == NULL) { + slot = choose_format(a); + if (slot < 0) { + a->archive.state = ARCHIVE_STATE_FATAL; + return (ARCHIVE_FATAL); + } + a->format = &(a->formats[slot]); + } + + /* + * If client didn't consume entire data, skip any remainder + * (This is especially important for GNU incremental directories.) + */ + if (a->archive.state == ARCHIVE_STATE_DATA) { + ret = archive_read_data_skip(&a->archive); + if (ret == ARCHIVE_EOF) { + archive_set_error(&a->archive, EIO, "Premature end-of-file."); + a->archive.state = ARCHIVE_STATE_FATAL; + return (ARCHIVE_FATAL); + } + if (ret != ARCHIVE_OK) + return (ret); + } + + /* Record start-of-header. */ + a->header_position = a->archive.file_position; + + ret = (a->format->read_header)(a, entry); + + /* + * EOF and FATAL are persistent at this layer. By + * modifying the state, we guarantee that future calls to + * read a header or read data will fail. 
+ */ + switch (ret) { + case ARCHIVE_EOF: + a->archive.state = ARCHIVE_STATE_EOF; + break; + case ARCHIVE_OK: + a->archive.state = ARCHIVE_STATE_DATA; + break; + case ARCHIVE_WARN: + a->archive.state = ARCHIVE_STATE_DATA; + break; + case ARCHIVE_RETRY: + break; + case ARCHIVE_FATAL: + a->archive.state = ARCHIVE_STATE_FATAL; + break; + } + + a->read_data_output_offset = 0; + a->read_data_remaining = 0; + return (ret); +} + +int +archive_read_next_header(struct archive *_a, struct archive_entry **entryp) +{ + int ret; + struct archive_read *a = (struct archive_read *)_a; + *entryp = NULL; + ret = archive_read_next_header2(_a, a->entry); + *entryp = a->entry; + return ret; +} + +/* + * Allow each registered format to bid on whether it wants to handle + * the next entry. Return index of winning bidder. + */ +static int +choose_format(struct archive_read *a) +{ + int slots; + int i; + int bid, best_bid; + int best_bid_slot; + + slots = sizeof(a->formats) / sizeof(a->formats[0]); + best_bid = -1; + best_bid_slot = -1; + + /* Set up a->format and a->pformat_data for convenience of bidders. */ + a->format = &(a->formats[0]); + for (i = 0; i < slots; i++, a->format++) { + if (a->format->bid) { + bid = (a->format->bid)(a); + if (bid == ARCHIVE_FATAL) + return (ARCHIVE_FATAL); + if ((bid > best_bid) || (best_bid_slot < 0)) { + best_bid = bid; + best_bid_slot = i; + } + } + } + + /* + * There were no bidders; this is a serious programmer error + * and demands a quick and definitive abort. + */ + if (best_bid_slot < 0) + __archive_errx(1, "No formats were registered; you must " + "invoke at least one archive_read_support_format_XXX " + "function in order to successfully read an archive."); + + /* + * There were bidders, but no non-zero bids; this means we + * can't support this stream. 
+ */ + if (best_bid < 1) { + archive_set_error(&a->archive, ARCHIVE_ERRNO_FILE_FORMAT, + "Unrecognized archive format"); + return (ARCHIVE_FATAL); + } + + return (best_bid_slot); +} + +/* + * Return the file offset (within the uncompressed data stream) where + * the last header started. + */ +#ifndef __minix +int64_t +archive_read_header_position(struct archive *_a) +{ + struct archive_read *a = (struct archive_read *)_a; + __archive_check_magic(_a, ARCHIVE_READ_MAGIC, + ARCHIVE_STATE_ANY, "archive_read_header_position"); + return (a->header_position); +} +#else +off_t +archive_read_header_position(struct archive *_a) +{ + struct archive_read *a = (struct archive_read *)_a; + __archive_check_magic(_a, ARCHIVE_READ_MAGIC, + ARCHIVE_STATE_ANY, "archive_read_header_position"); + return (a->header_position); +} +#endif + +/* + * Read data from an archive entry, using a read(2)-style interface. + * This is a convenience routine that just calls + * archive_read_data_block and copies the results into the client + * buffer, filling any gaps with zero bytes. Clients using this + * API can be completely ignorant of sparse-file issues; sparse files + * will simply be padded with nulls. + * + * DO NOT intermingle calls to this function and archive_read_data_block + * to read a single entry body. + */ +ssize_t +archive_read_data(struct archive *_a, void *buff, size_t s) +{ + struct archive_read *a = (struct archive_read *)_a; + char *dest; + const void *read_buf; + size_t bytes_read; + size_t len; + int r; + + bytes_read = 0; + dest = (char *)buff; + + while (s > 0) { + if (a->read_data_remaining == 0) { + read_buf = a->read_data_block; + r = archive_read_data_block(&a->archive, &read_buf, + &a->read_data_remaining, &a->read_data_offset); + a->read_data_block = read_buf; + if (r == ARCHIVE_EOF) + return (bytes_read); + /* + * Error codes are all negative, so the status + * return here cannot be confused with a valid + * byte count. (ARCHIVE_OK is zero.) 
+ */ + if (r < ARCHIVE_OK) + return (r); + } + + if (a->read_data_offset < a->read_data_output_offset) { + archive_set_error(&a->archive, ARCHIVE_ERRNO_FILE_FORMAT, + "Encountered out-of-order sparse blocks"); + return (ARCHIVE_RETRY); + } + + /* Compute the amount of zero padding needed. */ + if (a->read_data_output_offset + (off_t)s < + a->read_data_offset) { + len = s; + } else if (a->read_data_output_offset < + a->read_data_offset) { + len = a->read_data_offset - + a->read_data_output_offset; + } else + len = 0; + + /* Add zeroes. */ + memset(dest, 0, len); + s -= len; + a->read_data_output_offset += len; + dest += len; + bytes_read += len; + + /* Copy data if there is any space left. */ + if (s > 0) { + len = a->read_data_remaining; + if (len > s) + len = s; + memcpy(dest, a->read_data_block, len); + s -= len; + a->read_data_block += len; + a->read_data_remaining -= len; + a->read_data_output_offset += len; + a->read_data_offset += len; + dest += len; + bytes_read += len; + } + } + return (bytes_read); +} + +#if ARCHIVE_API_VERSION < 3 +/* + * Obsolete function provided for compatibility only. Note that the API + * of this function doesn't allow the caller to detect if the remaining + * data from the archive entry is shorter than the buffer provided, or + * even if an error occurred while reading data. + */ +int +archive_read_data_into_buffer(struct archive *a, void *d, ssize_t len) +{ + + archive_read_data(a, d, len); + return (ARCHIVE_OK); +} +#endif + +/* + * Skip over all remaining data in this entry. 
+ */ +int +archive_read_data_skip(struct archive *_a) +{ + struct archive_read *a = (struct archive_read *)_a; + int r; + const void *buff; + size_t size; + off_t offset; + + __archive_check_magic(_a, ARCHIVE_READ_MAGIC, ARCHIVE_STATE_DATA, + "archive_read_data_skip"); + + if (a->format->read_data_skip != NULL) + r = (a->format->read_data_skip)(a); + else { + while ((r = archive_read_data_block(&a->archive, + &buff, &size, &offset)) + == ARCHIVE_OK) + ; + } + + if (r == ARCHIVE_EOF) + r = ARCHIVE_OK; + + a->archive.state = ARCHIVE_STATE_HEADER; + return (r); +} + +/* + * Read the next block of entry data from the archive. + * This is a zero-copy interface; the client receives a pointer, + * size, and file offset of the next available block of data. + * + * Returns ARCHIVE_OK if the operation is successful, ARCHIVE_EOF if + * the end of entry is encountered. + */ +int +archive_read_data_block(struct archive *_a, + const void **buff, size_t *size, off_t *offset) +{ + struct archive_read *a = (struct archive_read *)_a; + __archive_check_magic(_a, ARCHIVE_READ_MAGIC, ARCHIVE_STATE_DATA, + "archive_read_data_block"); + + if (a->format->read_data == NULL) { + archive_set_error(&a->archive, ARCHIVE_ERRNO_PROGRAMMER, + "Internal error: " + "No format_read_data_block function registered"); + return (ARCHIVE_FATAL); + } + + return (a->format->read_data)(a, buff, size, offset); +} + +/* + * Close the file and release most resources. + * + * Be careful: client might just call read_new and then read_finish. + * Don't assume we actually read anything or performed any non-trivial + * initialization. 
+ */ +static int +_archive_read_close(struct archive *_a) +{ + struct archive_read *a = (struct archive_read *)_a; + int r = ARCHIVE_OK, r1 = ARCHIVE_OK; + size_t i, n; + + __archive_check_magic(&a->archive, ARCHIVE_READ_MAGIC, + ARCHIVE_STATE_ANY, "archive_read_close"); + archive_clear_error(&a->archive); + a->archive.state = ARCHIVE_STATE_CLOSED; + + + /* Call cleanup functions registered by optional components. */ + if (a->cleanup_archive_extract != NULL) + r = (a->cleanup_archive_extract)(a); + + /* TODO: Clean up the formatters. */ + + /* Release the filter objects. */ + r1 = cleanup_filters(a); + if (r1 < r) + r = r1; + + /* Release the bidder objects. */ + n = sizeof(a->bidders)/sizeof(a->bidders[0]); + for (i = 0; i < n; i++) { + if (a->bidders[i].free != NULL) { + r1 = (a->bidders[i].free)(&a->bidders[i]); + if (r1 < r) + r = r1; + } + } + + return (r); +} + +static int +cleanup_filters(struct archive_read *a) +{ + int r = ARCHIVE_OK; + /* Clean up the filter pipeline. */ + while (a->filter != NULL) { + struct archive_read_filter *t = a->filter->upstream; + if (a->filter->close != NULL) { + int r1 = (a->filter->close)(a->filter); + if (r1 < r) + r = r1; + } + free(a->filter->buffer); + free(a->filter); + a->filter = t; + } + return r; +} + +/* + * Release memory and other resources. + */ +static int +_archive_read_finish(struct archive *_a) +{ + struct archive_read *a = (struct archive_read *)_a; + int i; + int slots; + int r = ARCHIVE_OK; + + __archive_check_magic(_a, ARCHIVE_READ_MAGIC, ARCHIVE_STATE_ANY, + "archive_read_finish"); + if (a->archive.state != ARCHIVE_STATE_CLOSED) + r = archive_read_close(&a->archive); + + /* Cleanup format-specific data. 
*/ + slots = sizeof(a->formats) / sizeof(a->formats[0]); + for (i = 0; i < slots; i++) { + a->format = &(a->formats[i]); + if (a->formats[i].cleanup) + (a->formats[i].cleanup)(a); + } + + archive_string_free(&a->archive.error_string); + if (a->entry) + archive_entry_free(a->entry); + a->archive.magic = 0; + free(a); +#if ARCHIVE_API_VERSION > 1 + return (r); +#endif +} + +/* + * Used internally by read format handlers to register their bid and + * initialization functions. + */ +int +__archive_read_register_format(struct archive_read *a, + void *format_data, + const char *name, + int (*bid)(struct archive_read *), + int (*options)(struct archive_read *, const char *, const char *), + int (*read_header)(struct archive_read *, struct archive_entry *), + int (*read_data)(struct archive_read *, const void **, size_t *, off_t *), + int (*read_data_skip)(struct archive_read *), + int (*cleanup)(struct archive_read *)) +{ + int i, number_slots; + + __archive_check_magic(&a->archive, + ARCHIVE_READ_MAGIC, ARCHIVE_STATE_NEW, + "__archive_read_register_format"); + + number_slots = sizeof(a->formats) / sizeof(a->formats[0]); + + for (i = 0; i < number_slots; i++) { + if (a->formats[i].bid == bid) + return (ARCHIVE_WARN); /* We've already installed */ + if (a->formats[i].bid == NULL) { + a->formats[i].bid = bid; + a->formats[i].options = options; + a->formats[i].read_header = read_header; + a->formats[i].read_data = read_data; + a->formats[i].read_data_skip = read_data_skip; + a->formats[i].cleanup = cleanup; + a->formats[i].data = format_data; + a->formats[i].name = name; + return (ARCHIVE_OK); + } + } + + __archive_errx(1, "Not enough slots for format registration"); + return (ARCHIVE_FATAL); /* Never actually called. */ +} + +/* + * Used internally by decompression routines to register their bid and + * initialization functions. 
+ */ +struct archive_read_filter_bidder * +__archive_read_get_bidder(struct archive_read *a) +{ + int i, number_slots; + + __archive_check_magic(&a->archive, + ARCHIVE_READ_MAGIC, ARCHIVE_STATE_NEW, + "__archive_read_get_bidder"); + + number_slots = sizeof(a->bidders) / sizeof(a->bidders[0]); + + for (i = 0; i < number_slots; i++) { + if (a->bidders[i].bid == NULL) { + memset(a->bidders + i, 0, sizeof(a->bidders[0])); + return (a->bidders + i); + } + } + + __archive_errx(1, "Not enough slots for compression registration"); + return (NULL); /* Never actually executed. */ +} + +/* + * The next three functions comprise the peek/consume internal I/O + * system used by archive format readers. This system allows fairly + * flexible read-ahead and allows the I/O code to operate in a + * zero-copy manner most of the time. + * + * In the ideal case, filters generate blocks of data + * and __archive_read_ahead() just returns pointers directly into + * those blocks. Then __archive_read_consume() just bumps those + * pointers. Only if your request would span blocks does the I/O + * layer use a copy buffer to provide you with a contiguous block of + * data. The __archive_read_skip() is an optimization; it scans ahead + * very quickly (it usually translates into a seek() operation if + * you're reading uncompressed disk files). + * + * A couple of useful idioms: + * * "I just want some data." Ask for 1 byte and pay attention to + * the "number of bytes available" from __archive_read_ahead(). + * You can consume more than you asked for; you just can't consume + * more than is available. If you consume everything that's + * immediately available, the next read_ahead() call will pull + * the next block. + * * "I want to output a large block of data." As above, ask for 1 byte, + * emit all that's available (up to whatever limit you have), then + * repeat until you're done. + * * "I want to peek ahead by a large amount." 
Ask for 4k or so, then + * double and repeat until you get an error or have enough. Note + * that the I/O layer will likely end up expanding its copy buffer + * to fit your request, so use this technique cautiously. This + * technique is used, for example, by some of the format tasting + * code that has uncertain look-ahead needs. + * + * TODO: Someday, provide a more generic __archive_read_seek() for + * those cases where it's useful. This is tricky because there are lots + * of cases where seek() is not available (reading gzip data from a + * network socket, for instance), so there needs to be a good way to + * communicate whether seek() is available and users of that interface + * need to use non-seeking strategies whenever seek() is not available. + */ + +/* + * Looks ahead in the input stream: + * * If 'avail' pointer is provided, that returns number of bytes available + * in the current buffer, which may be much larger than requested. + * * If end-of-file, *avail gets set to zero. + * * If error, *avail gets error code. + * * If request can be met, returns pointer to data, returns NULL + * if request is not met. + * + * Note: If you just want "some data", ask for 1 byte and pay attention + * to *avail, which will have the actual amount available. If you + * know exactly how many bytes you need, just ask for that and treat + * a NULL return as an error. + * + * Important: This does NOT move the file pointer. See + * __archive_read_consume() below. + */ + +/* + * This is tricky. We need to provide our clients with pointers to + * contiguous blocks of memory but we want to avoid copying whenever + * possible. + * + * Mostly, this code returns pointers directly into the block of data + * provided by the client_read routine. It can do this unless the + * request would split across blocks. In that case, we have to copy + * into an internal buffer to combine reads. 
+ */ +const void * +__archive_read_ahead(struct archive_read *a, size_t min, ssize_t *avail) +{ + return (__archive_read_filter_ahead(a->filter, min, avail)); +} + +const void * +__archive_read_filter_ahead(struct archive_read_filter *filter, + size_t min, ssize_t *avail) +{ + ssize_t bytes_read; + size_t tocopy; + + if (filter->fatal) { + if (avail) + *avail = ARCHIVE_FATAL; + return (NULL); + } + + /* + * Keep pulling more data until we can satisfy the request. + */ + for (;;) { + + /* + * If we can satisfy from the copy buffer (and the + * copy buffer isn't empty), we're done. In particular, + * note that min == 0 is a perfectly well-defined + * request. + */ + if (filter->avail >= min && filter->avail > 0) { + if (avail != NULL) + *avail = filter->avail; + return (filter->next); + } + + /* + * We can satisfy directly from client buffer if everything + * currently in the copy buffer is still in the client buffer. + */ + if (filter->client_total >= filter->client_avail + filter->avail + && filter->client_avail + filter->avail >= min) { + /* "Roll back" to client buffer. */ + filter->client_avail += filter->avail; + filter->client_next -= filter->avail; + /* Copy buffer is now empty. */ + filter->avail = 0; + filter->next = filter->buffer; + /* Return data from client buffer. */ + if (avail != NULL) + *avail = filter->client_avail; + return (filter->client_next); + } + + /* Move data forward in copy buffer if necessary. */ + if (filter->next > filter->buffer && + filter->next + min > filter->buffer + filter->buffer_size) { + if (filter->avail > 0) + memmove(filter->buffer, filter->next, filter->avail); + filter->next = filter->buffer; + } + + /* If we've used up the client data, get more. */ + if (filter->client_avail <= 0) { + if (filter->end_of_file) { + if (avail != NULL) + *avail = 0; + return (NULL); + } + bytes_read = (filter->read)(filter, + &filter->client_buff); + if (bytes_read < 0) { /* Read error. 
*/ + filter->client_total = filter->client_avail = 0; + filter->client_next = filter->client_buff = NULL; + filter->fatal = 1; + if (avail != NULL) + *avail = ARCHIVE_FATAL; + return (NULL); + } + if (bytes_read == 0) { /* Premature end-of-file. */ + filter->client_total = filter->client_avail = 0; + filter->client_next = filter->client_buff = NULL; + filter->end_of_file = 1; + /* Return whatever we do have. */ + if (avail != NULL) + *avail = filter->avail; + return (NULL); + } + filter->position += bytes_read; + filter->client_total = bytes_read; + filter->client_avail = filter->client_total; + filter->client_next = filter->client_buff; + } + else + { + /* + * We can't satisfy the request from the copy + * buffer or the existing client data, so we + * need to copy more client data over to the + * copy buffer. + */ + + /* Ensure the buffer is big enough. */ + if (min > filter->buffer_size) { + size_t s, t; + char *p; + + /* Double the buffer; watch for overflow. */ + s = t = filter->buffer_size; + if (s == 0) + s = min; + while (s < min) { + t *= 2; + if (t <= s) { /* Integer overflow! */ + archive_set_error( + &filter->archive->archive, + ENOMEM, + "Unable to allocate copy buffer"); + filter->fatal = 1; + if (avail != NULL) + *avail = ARCHIVE_FATAL; + return (NULL); + } + s = t; + } + /* Now s >= min, so allocate a new buffer. */ + p = (char *)malloc(s); + if (p == NULL) { + archive_set_error( + &filter->archive->archive, + ENOMEM, + "Unable to allocate copy buffer"); + filter->fatal = 1; + if (avail != NULL) + *avail = ARCHIVE_FATAL; + return (NULL); + } + /* Move data into newly-enlarged buffer. */ + if (filter->avail > 0) + memmove(p, filter->next, filter->avail); + free(filter->buffer); + filter->next = filter->buffer = p; + filter->buffer_size = s; + } + + /* We can add client data to copy buffer. */ + /* First estimate: copy to fill rest of buffer. 
*/ + tocopy = (filter->buffer + filter->buffer_size) + - (filter->next + filter->avail); + /* Don't waste time buffering more than we need to. */ + if (tocopy + filter->avail > min) + tocopy = min - filter->avail; + /* Don't copy more than is available. */ + if (tocopy > filter->client_avail) + tocopy = filter->client_avail; + + memcpy(filter->next + filter->avail, filter->client_next, + tocopy); + /* Remove this data from client buffer. */ + filter->client_next += tocopy; + filter->client_avail -= tocopy; + /* add it to copy buffer. */ + filter->avail += tocopy; + } + } +} + +/* + * Move the file pointer forward. This should be called after + * __archive_read_ahead() returns data to you. Don't try to move + * ahead by more than the amount of data available according to + * __archive_read_ahead(). + */ +/* + * Mark the appropriate data as used. Note that the request here will + * often be much smaller than the size of the previous read_ahead + * request. + */ +ssize_t +__archive_read_consume(struct archive_read *a, size_t request) +{ + ssize_t r; + r = __archive_read_filter_consume(a->filter, request); + a->archive.file_position += r; + return (r); +} + +ssize_t +__archive_read_filter_consume(struct archive_read_filter * filter, + size_t request) +{ + if (filter->avail > 0) { + /* Read came from copy buffer. */ + filter->next += request; + filter->avail -= request; + } else { + /* Read came from client buffer. */ + filter->client_next += request; + filter->client_avail -= request; + } + return (request); +} + +/* + * Move the file pointer ahead by an arbitrary amount. If you're + * reading uncompressed data from a disk file, this will actually + * translate into a seek() operation. Even in cases where seek() + * isn't feasible, this at least pushes the read-and-discard loop + * down closer to the data source. 
+ */ +#ifndef __minix +int64_t +__archive_read_skip(struct archive_read *a, int64_t request) +{ + int64_t skipped = __archive_read_skip_lenient(a, request); + if (skipped == request) + return (skipped); + /* We hit EOF before we satisfied the skip request. */ + if (skipped < 0) // Map error code to 0 for error message below. + skipped = 0; + archive_set_error(&a->archive, + ARCHIVE_ERRNO_MISC, + "Truncated input file (needed %jd bytes, only %jd available)", + (intmax_t)request, (intmax_t)skipped); + return (ARCHIVE_FATAL); +} +#else +ssize_t +__archive_read_skip(struct archive_read *a, ssize_t request) +{ + ssize_t skipped = __archive_read_skip_lenient(a, request); + if (skipped == request) + return (skipped); + /* We hit EOF before we satisfied the skip request. */ + if (skipped < 0) /* Map error code to 0 for error message below. */ + skipped = 0; + archive_set_error(&a->archive, + ARCHIVE_ERRNO_MISC, + "Truncated input file (needed %jd bytes, only %jd available)", + (intmax_t)request, (intmax_t)skipped); + return (ARCHIVE_FATAL); +} +#endif + +#ifndef __minix +int64_t +__archive_read_skip_lenient(struct archive_read *a, int64_t request) +{ + int64_t skipped = __archive_read_filter_skip(a->filter, request); + if (skipped > 0) + a->archive.file_position += skipped; + return (skipped); +} +#else +ssize_t +__archive_read_skip_lenient(struct archive_read *a, ssize_t request) +{ + ssize_t skipped = __archive_read_filter_skip(a->filter, request); + if (skipped > 0) + a->archive.file_position += skipped; + return (skipped); +} +#endif + +#ifndef __minix +int64_t +__archive_read_filter_skip(struct archive_read_filter *filter, int64_t request) +{ + int64_t bytes_skipped, total_bytes_skipped = 0; + size_t min; + + if (filter->fatal) + return (-1); + /* + * If there is data in the buffers already, use that first. 
+ */ + if (filter->avail > 0) { + min = minimum(request, (off_t)filter->avail); + bytes_skipped = __archive_read_filter_consume(filter, min); + request -= bytes_skipped; + total_bytes_skipped += bytes_skipped; + } + if (filter->client_avail > 0) { + min = minimum(request, (int64_t)filter->client_avail); + bytes_skipped = __archive_read_filter_consume(filter, min); + request -= bytes_skipped; + total_bytes_skipped += bytes_skipped; + } + if (request == 0) + return (total_bytes_skipped); + /* + * If a client_skipper was provided, try that first. + */ +#if ARCHIVE_API_VERSION < 2 + if ((filter->skip != NULL) && (request < SSIZE_MAX)) { +#else + if (filter->skip != NULL) { +#endif + bytes_skipped = (filter->skip)(filter, request); + if (bytes_skipped < 0) { /* error */ + filter->client_total = filter->client_avail = 0; + filter->client_next = filter->client_buff = NULL; + filter->fatal = 1; + return (bytes_skipped); + } + total_bytes_skipped += bytes_skipped; + request -= bytes_skipped; + filter->client_next = filter->client_buff; + filter->client_avail = filter->client_total = 0; + } + /* + * Note that client_skipper will usually not satisfy the + * full request (due to low-level blocking concerns), + * so even if client_skipper is provided, we may still + * have to use ordinary reads to finish out the request. 
+ */ + while (request > 0) { + ssize_t bytes_read; + (void)__archive_read_filter_ahead(filter, 1, &bytes_read); + if (bytes_read < 0) + return (bytes_read); + if (bytes_read == 0) { + return (total_bytes_skipped); + } + min = (size_t)(minimum(bytes_read, request)); + bytes_read = __archive_read_filter_consume(filter, min); + total_bytes_skipped += bytes_read; + request -= bytes_read; + } + return (total_bytes_skipped); +} +#else +ssize_t +__archive_read_filter_skip(struct archive_read_filter *filter, ssize_t request) +{ + ssize_t bytes_skipped, total_bytes_skipped = 0; + size_t min; + + if (filter->fatal) + return (-1); + /* + * If there is data in the buffers already, use that first. + */ + if (filter->avail > 0) { + min = minimum(request, (off_t)filter->avail); + bytes_skipped = __archive_read_filter_consume(filter, min); + request -= bytes_skipped; + total_bytes_skipped += bytes_skipped; + } + if (filter->client_avail > 0) { + min = minimum(request, (off_t)filter->client_avail); + bytes_skipped = __archive_read_filter_consume(filter, min); + request -= bytes_skipped; + total_bytes_skipped += bytes_skipped; + } + if (request == 0) + return (total_bytes_skipped); + /* + * If a client_skipper was provided, try that first.
+ */ +#if ARCHIVE_API_VERSION < 2 + if ((filter->skip != NULL) && (request < SSIZE_MAX)) { +#else + if (filter->skip != NULL) { +#endif + bytes_skipped = (filter->skip)(filter, request); + if (bytes_skipped < 0) { /* error */ + filter->client_total = filter->client_avail = 0; + filter->client_next = filter->client_buff = NULL; + filter->fatal = 1; + return (bytes_skipped); + } + total_bytes_skipped += bytes_skipped; + request -= bytes_skipped; + filter->client_next = filter->client_buff; + filter->client_avail = filter->client_total = 0; + } + /* + * Note that client_skipper will usually not satisfy the + * full request (due to low-level blocking concerns), + * so even if client_skipper is provided, we may still + * have to use ordinary reads to finish out the request. + */ + while (request > 0) { + ssize_t bytes_read; + (void)__archive_read_filter_ahead(filter, 1, &bytes_read); + if (bytes_read < 0) + return (bytes_read); + if (bytes_read == 0) { + return (total_bytes_skipped); + } + min = (size_t)(minimum(bytes_read, request)); + bytes_read = __archive_read_filter_consume(filter, min); + total_bytes_skipped += bytes_read; + request -= bytes_read; + } + return (total_bytes_skipped); +} +#endif diff --git a/lib/libarchive/archive_read_data_into_fd.c b/lib/libarchive/archive_read_data_into_fd.c new file mode 100644 index 000000000..3aeef3bc2 --- /dev/null +++ b/lib/libarchive/archive_read_data_into_fd.c @@ -0,0 +1,93 @@ +/*- + * Copyright (c) 2003-2007 Tim Kientzle + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. 
Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR(S) ``AS IS'' AND ANY EXPRESS OR + * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES + * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. + * IN NO EVENT SHALL THE AUTHOR(S) BE LIABLE FOR ANY DIRECT, INDIRECT, + * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT + * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF + * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + */ + +#include "archive_platform.h" +__FBSDID("$FreeBSD: src/lib/libarchive/archive_read_data_into_fd.c,v 1.16 2008/05/23 05:01:29 cperciva Exp $"); + +#ifdef HAVE_SYS_TYPES_H +#include <sys/types.h> +#endif +#ifdef HAVE_ERRNO_H +#include <errno.h> +#endif +#ifdef HAVE_UNISTD_H +#include <unistd.h> +#endif + +#include "archive.h" +#include "archive_private.h" + +/* Maximum amount of data to write at one time. */ +#define MAX_WRITE (1024 * 1024) + +/* + * This implementation minimizes copying of data and is sparse-file aware.
+ */ +int +archive_read_data_into_fd(struct archive *a, int fd) +{ + int r; + const void *buff; + size_t size, bytes_to_write; + ssize_t bytes_written, total_written; + off_t offset; + off_t output_offset; + + __archive_check_magic(a, ARCHIVE_READ_MAGIC, ARCHIVE_STATE_DATA, "archive_read_data_into_fd"); + + total_written = 0; + output_offset = 0; + + while ((r = archive_read_data_block(a, &buff, &size, &offset)) == + ARCHIVE_OK) { + const char *p = buff; + if (offset > output_offset) { + output_offset = lseek(fd, + offset - output_offset, SEEK_CUR); + if (output_offset != offset) { + archive_set_error(a, errno, "Seek error"); + return (ARCHIVE_FATAL); + } + } + while (size > 0) { + bytes_to_write = size; + if (bytes_to_write > MAX_WRITE) + bytes_to_write = MAX_WRITE; + bytes_written = write(fd, p, bytes_to_write); + if (bytes_written < 0) { + archive_set_error(a, errno, "Write error"); + return (ARCHIVE_FATAL); + } + output_offset += bytes_written; + total_written += bytes_written; + p += bytes_written; + size -= bytes_written; + } + } + + if (r != ARCHIVE_EOF) + return (r); + return (ARCHIVE_OK); +} diff --git a/lib/libarchive/archive_read_disk.3 b/lib/libarchive/archive_read_disk.3 new file mode 100644 index 000000000..b3a09b528 --- /dev/null +++ b/lib/libarchive/archive_read_disk.3 @@ -0,0 +1,308 @@ +.\" Copyright (c) 2003-2009 Tim Kientzle +.\" All rights reserved. +.\" +.\" Redistribution and use in source and binary forms, with or without +.\" modification, are permitted provided that the following conditions +.\" are met: +.\" 1. Redistributions of source code must retain the above copyright +.\" notice, this list of conditions and the following disclaimer. +.\" 2. Redistributions in binary form must reproduce the above copyright +.\" notice, this list of conditions and the following disclaimer in the +.\" documentation and/or other materials provided with the distribution. 
+.\" +.\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND +.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +.\" ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE +.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS +.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) +.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT +.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY +.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF +.\" SUCH DAMAGE. +.\" +.\" $FreeBSD: head/lib/libarchive/archive_read_disk.3 190957 2009-04-12 05:04:02Z kientzle $ +.\" +.Dd March 10, 2009 +.Dt archive_read_disk 3 +.Os +.Sh NAME +.Nm archive_read_disk_new , +.Nm archive_read_disk_set_symlink_logical , +.Nm archive_read_disk_set_symlink_physical , +.Nm archive_read_disk_set_symlink_hybrid , +.Nm archive_read_disk_entry_from_file , +.Nm archive_read_disk_gname , +.Nm archive_read_disk_uname , +.Nm archive_read_disk_set_uname_lookup , +.Nm archive_read_disk_set_gname_lookup , +.Nm archive_read_disk_set_standard_lookup , +.Nm archive_read_close , +.Nm archive_read_finish +.Nd functions for reading objects from disk +.Sh SYNOPSIS +.In archive.h +.Ft struct archive * +.Fn archive_read_disk_new "void" +.Ft int +.Fn archive_read_disk_set_symlink_logical "struct archive *" +.Ft int +.Fn archive_read_disk_set_symlink_physical "struct archive *" +.Ft int +.Fn archive_read_disk_set_symlink_hybrid "struct archive *" +.Ft const char * +.Fn archive_read_disk_gname "struct archive *" "gid_t" +.Ft const char * +.Fn archive_read_disk_uname "struct archive *" "uid_t" +.Ft int +.Fo archive_read_disk_set_gname_lookup +.Fa "struct archive *" +.Fa "void *" +.Fa "const char
*(*lookup)(void *, gid_t)" +.Fa "void (*cleanup)(void *)" +.Fc +.Ft int +.Fo archive_read_disk_set_uname_lookup +.Fa "struct archive *" +.Fa "void *" +.Fa "const char *(*lookup)(void *, uid_t)" +.Fa "void (*cleanup)(void *)" +.Fc +.Ft int +.Fn archive_read_disk_set_standard_lookup "struct archive *" +.Ft int +.Fo archive_read_disk_entry_from_file +.Fa "struct archive *" +.Fa "struct archive_entry *" +.Fa "int fd" +.Fa "const struct stat *" +.Fc +.Ft int +.Fn archive_read_close "struct archive *" +.Ft int +.Fn archive_read_finish "struct archive *" +.Sh DESCRIPTION +These functions provide an API for reading information about +objects on disk. +In particular, they provide an interface for populating +.Tn struct archive_entry +objects. +.Bl -tag -width indent +.It Fn archive_read_disk_new +Allocates and initializes a +.Tn struct archive +object suitable for reading object information from disk. +.It Xo +.Fn archive_read_disk_set_symlink_logical , +.Fn archive_read_disk_set_symlink_physical , +.Fn archive_read_disk_set_symlink_hybrid +.Xc +This sets the mode used for handling symbolic links. +The +.Dq logical +mode follows all symbolic links. +The +.Dq physical +mode does not follow any symbolic links. +The +.Dq hybrid +mode currently behaves identically to the +.Dq logical +mode. +.It Xo +.Fn archive_read_disk_gname , +.Fn archive_read_disk_uname +.Xc +Returns a user or group name given a gid or uid value. +By default, these always return a NULL string. +.It Xo +.Fn archive_read_disk_set_gname_lookup , +.Fn archive_read_disk_set_uname_lookup +.Xc +These allow you to override the functions used for +user and group name lookups. +You may also provide a +.Tn void * +pointer to a private data structure and a cleanup function for +that data. +The cleanup function will be invoked when the +.Tn struct archive +object is destroyed or when new lookup functions are registered. 
+.It Fn archive_read_disk_set_standard_lookup +This convenience function installs a standard set of user +and group name lookup functions. +These functions use +.Xr getpwuid 3 +and +.Xr getgrgid 3 +to convert ids to names, defaulting to NULL if the names cannot +be looked up. +These functions also implement a simple memory cache to reduce +the number of calls to +.Xr getpwuid 3 +and +.Xr getgrgid 3 . +.It Fn archive_read_disk_entry_from_file +Populates a +.Tn struct archive_entry +object with information about a particular file. +The +.Tn archive_entry +object must have already been created with +.Xr archive_entry_new 3 +and at least one of the source path or path fields must already be set. +(If both are set, the source path will be used.) +.Pp +Information is read from disk using the path name from the +.Tn struct archive_entry +object. +If a file descriptor is provided, some information will be obtained using +that file descriptor, on platforms that support the appropriate +system calls. +.Pp +If a pointer to a +.Tn struct stat +is provided, information from that structure will be used instead +of reading from the disk where appropriate. +This can provide performance benefits in scenarios where +.Tn struct stat +information has already been read from the disk as a side effect +of some other operation. +(For example, directory traversal libraries often provide this information.) +.Pp +Where necessary, user and group ids are converted to user and group names +using the currently registered lookup functions above. +This affects the file ownership fields and ACL values in the +.Tn struct archive_entry +object. +.It Fn archive_read_close +This currently does nothing. +.It Fn archive_read_finish +Invokes +.Fn archive_read_close +if it was not invoked manually, then releases all resources. +.El +More information about the +.Va struct archive +object and the overall design of the library can be found in the +.Xr libarchive 3 +overview.
+.Sh EXAMPLE +The following illustrates basic usage of the library by +showing how to use it to copy an item on disk into an archive. +.Bd -literal -offset indent +void +file_to_archive(struct archive *a, const char *name) +{ + char buff[8192]; + size_t bytes_read; + struct archive *ard; + struct archive_entry *entry; + int fd; + + ard = archive_read_disk_new(); + archive_read_disk_set_standard_lookup(ard); + entry = archive_entry_new(); + fd = open(name, O_RDONLY); + if (fd < 0) + return; + archive_entry_copy_sourcepath(entry, name); + archive_read_disk_entry_from_file(ard, entry, fd, NULL); + archive_write_header(a, entry); + while ((bytes_read = read(fd, buff, sizeof(buff))) > 0) + archive_write_data(a, buff, bytes_read); + archive_write_finish_entry(a); + archive_read_finish(ard); + archive_entry_free(entry); +} +.Ed +.Sh RETURN VALUES +Most functions return +.Cm ARCHIVE_OK +(zero) on success, or one of several negative +error codes for errors. +Specific error codes include: +.Cm ARCHIVE_RETRY +for operations that might succeed if retried, +.Cm ARCHIVE_WARN +for unusual conditions that do not prevent further operations, and +.Cm ARCHIVE_FATAL +for serious errors that make remaining operations impossible. +The +.Xr archive_errno 3 +and +.Xr archive_error_string 3 +functions can be used to retrieve an appropriate error code and a +textual error message. +(See +.Xr archive_util 3 +for details.) +.Pp +.Fn archive_read_disk_new +returns a pointer to a newly-allocated +.Tn struct archive +object or NULL if the allocation failed for any reason. +.Pp +.Fn archive_read_disk_gname +and +.Fn archive_read_disk_uname +return +.Tn const char * +pointers to the textual name or NULL if the lookup failed for any reason. +The returned pointer points to internal storage that +may be reused on the next call to either of these functions; +callers should copy the string if they need to continue accessing it. 
+.Pp +.Sh SEE ALSO +.Xr archive_read 3 , +.Xr archive_write 3 , +.Xr archive_write_disk 3 , +.Xr tar 1 , +.Xr libarchive 3 +.Sh HISTORY +The +.Nm libarchive +library first appeared in +.Fx 5.3 . +The +.Nm archive_read_disk +interface was added to +.Nm libarchive 2.6 +and first appeared in +.Fx 8.0 . +.Sh AUTHORS +.An -nosplit +The +.Nm libarchive +library was written by +.An Tim Kientzle Aq kientzle@freebsd.org . +.Sh BUGS +The +.Dq standard +user name and group name lookup functions are not the defaults because +.Xr getgrgid 3 +and +.Xr getpwuid 3 +are sometimes too large for particular applications. +The current design allows the application author to use a more +compact implementation when appropriate. +.Pp +The full list of metadata read from disk by +.Fn archive_read_disk_entry_from_file +is necessarily system-dependent. +.Pp +The +.Fn archive_read_disk_entry_from_file +function reads as much information as it can from disk. +Some method should be provided to limit this so that clients who +do not need ACLs, for instance, can avoid the extra work needed +to look up such information. +.Pp +This API should provide a set of methods for walking a directory tree. +That would make it a direct parallel of the +.Xr archive_read 3 +API. +When such methods are implemented, the +.Dq hybrid +symbolic link mode will make sense. diff --git a/lib/libarchive/archive_read_disk.c b/lib/libarchive/archive_read_disk.c new file mode 100644 index 000000000..8fad7f137 --- /dev/null +++ b/lib/libarchive/archive_read_disk.c @@ -0,0 +1,198 @@ +/*- + * Copyright (c) 2003-2009 Tim Kientzle + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer + * in this position and unchanged. + * 2.
Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR(S) ``AS IS'' AND ANY EXPRESS OR + * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES + * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. + * IN NO EVENT SHALL THE AUTHOR(S) BE LIABLE FOR ANY DIRECT, INDIRECT, + * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT + * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF + * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + */ + +#include "archive_platform.h" +__FBSDID("$FreeBSD: head/lib/libarchive/archive_read_disk.c 189429 2009-03-06 04:35:31Z kientzle $"); + +#include "archive.h" +#include "archive_string.h" +#include "archive_entry.h" +#include "archive_private.h" +#include "archive_read_disk_private.h" + +static int _archive_read_finish(struct archive *); +static int _archive_read_close(struct archive *); +static const char *trivial_lookup_gname(void *, gid_t gid); +static const char *trivial_lookup_uname(void *, uid_t uid); + +static struct archive_vtable * +archive_read_disk_vtable(void) +{ + static struct archive_vtable av; + static int inited = 0; + + if (!inited) { + av.archive_finish = _archive_read_finish; + av.archive_close = _archive_read_close; + } + return (&av); +} + +const char * +archive_read_disk_gname(struct archive *_a, gid_t gid) +{ + struct archive_read_disk *a = (struct archive_read_disk *)_a; + if (a->lookup_gname != NULL) + return ((*a->lookup_gname)(a->lookup_gname_data, gid)); + return (NULL); +} + +const 
char * +archive_read_disk_uname(struct archive *_a, uid_t uid) +{ + struct archive_read_disk *a = (struct archive_read_disk *)_a; + if (a->lookup_uname != NULL) + return ((*a->lookup_uname)(a->lookup_uname_data, uid)); + return (NULL); +} + +int +archive_read_disk_set_gname_lookup(struct archive *_a, + void *private_data, + const char * (*lookup_gname)(void *private, gid_t gid), + void (*cleanup_gname)(void *private)) +{ + struct archive_read_disk *a = (struct archive_read_disk *)_a; + __archive_check_magic(&a->archive, ARCHIVE_READ_DISK_MAGIC, + ARCHIVE_STATE_ANY, "archive_read_disk_set_gname_lookup"); + + if (a->cleanup_gname != NULL && a->lookup_gname_data != NULL) + (a->cleanup_gname)(a->lookup_gname_data); + + a->lookup_gname = lookup_gname; + a->cleanup_gname = cleanup_gname; + a->lookup_gname_data = private_data; + return (ARCHIVE_OK); +} + +int +archive_read_disk_set_uname_lookup(struct archive *_a, + void *private_data, + const char * (*lookup_uname)(void *private, uid_t uid), + void (*cleanup_uname)(void *private)) +{ + struct archive_read_disk *a = (struct archive_read_disk *)_a; + __archive_check_magic(&a->archive, ARCHIVE_READ_DISK_MAGIC, + ARCHIVE_STATE_ANY, "archive_read_disk_set_uname_lookup"); + + if (a->cleanup_uname != NULL && a->lookup_uname_data != NULL) + (a->cleanup_uname)(a->lookup_uname_data); + + a->lookup_uname = lookup_uname; + a->cleanup_uname = cleanup_uname; + a->lookup_uname_data = private_data; + return (ARCHIVE_OK); +} + +/* + * Create a new archive_read_disk object and initialize it with global state. + */ +struct archive * +archive_read_disk_new(void) +{ + struct archive_read_disk *a; + + a = (struct archive_read_disk *)malloc(sizeof(*a)); + if (a == NULL) + return (NULL); + memset(a, 0, sizeof(*a)); + a->archive.magic = ARCHIVE_READ_DISK_MAGIC; + /* We're ready to write a header immediately. 
*/ + a->archive.state = ARCHIVE_STATE_HEADER; + a->archive.vtable = archive_read_disk_vtable(); + a->lookup_uname = trivial_lookup_uname; + a->lookup_gname = trivial_lookup_gname; + return (&a->archive); +} + +static int +_archive_read_finish(struct archive *_a) +{ + struct archive_read_disk *a = (struct archive_read_disk *)_a; + + if (a->cleanup_gname != NULL && a->lookup_gname_data != NULL) + (a->cleanup_gname)(a->lookup_gname_data); + if (a->cleanup_uname != NULL && a->lookup_uname_data != NULL) + (a->cleanup_uname)(a->lookup_uname_data); + archive_string_free(&a->archive.error_string); + free(a); + return (ARCHIVE_OK); +} + +static int +_archive_read_close(struct archive *_a) +{ + (void)_a; /* UNUSED */ + return (ARCHIVE_OK); +} + +int +archive_read_disk_set_symlink_logical(struct archive *_a) +{ + struct archive_read_disk *a = (struct archive_read_disk *)_a; + a->symlink_mode = 'L'; + a->follow_symlinks = 1; + return (ARCHIVE_OK); +} + +int +archive_read_disk_set_symlink_physical(struct archive *_a) +{ + struct archive_read_disk *a = (struct archive_read_disk *)_a; + a->symlink_mode = 'P'; + a->follow_symlinks = 0; + return (ARCHIVE_OK); +} + +int +archive_read_disk_set_symlink_hybrid(struct archive *_a) +{ + struct archive_read_disk *a = (struct archive_read_disk *)_a; + a->symlink_mode = 'H'; + a->follow_symlinks = 1; /* Follow symlinks initially. */ + return (ARCHIVE_OK); +} + +/* + * Trivial implementations of gname/uname lookup functions. + * These are normally overridden by the client, but these stub + * versions ensure that we always have something that works. 
+ */ +static const char * +trivial_lookup_gname(void *private_data, gid_t gid) +{ + (void)private_data; /* UNUSED */ + (void)gid; /* UNUSED */ + return (NULL); +} + +static const char * +trivial_lookup_uname(void *private_data, uid_t uid) +{ + (void)private_data; /* UNUSED */ + (void)uid; /* UNUSED */ + return (NULL); +} diff --git a/lib/libarchive/archive_read_disk_entry_from_file.c b/lib/libarchive/archive_read_disk_entry_from_file.c new file mode 100644 index 000000000..6acc09662 --- /dev/null +++ b/lib/libarchive/archive_read_disk_entry_from_file.c @@ -0,0 +1,569 @@ +/*- + * Copyright (c) 2003-2009 Tim Kientzle + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR(S) ``AS IS'' AND ANY EXPRESS OR + * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES + * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. + * IN NO EVENT SHALL THE AUTHOR(S) BE LIABLE FOR ANY DIRECT, INDIRECT, + * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT + * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF + * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 
+ */ + +#include "archive_platform.h" +__FBSDID("$FreeBSD: head/lib/libarchive/archive_read_disk_entry_from_file.c 201084 2009-12-28 02:14:09Z kientzle $"); + +#ifdef HAVE_SYS_TYPES_H +/* Mac OSX requires sys/types.h before sys/acl.h. */ +#include <sys/types.h> +#endif +#ifdef HAVE_SYS_ACL_H +#include <sys/acl.h> +#endif +#ifdef HAVE_SYS_EXTATTR_H +#include <sys/extattr.h> +#endif +#ifdef HAVE_SYS_PARAM_H +#include <sys/param.h> +#endif +#ifdef HAVE_SYS_STAT_H +#include <sys/stat.h> +#endif +#ifdef HAVE_SYS_XATTR_H +#include <sys/xattr.h> +#endif +#ifdef HAVE_ACL_LIBACL_H +#include <acl/libacl.h> +#endif +#ifdef HAVE_ATTR_XATTR_H +#include <attr/xattr.h> +#endif +#ifdef HAVE_ERRNO_H +#include <errno.h> +#endif +#ifdef HAVE_LIMITS_H +#include <limits.h> +#endif +#ifdef HAVE_WINDOWS_H +#include <windows.h> +#endif + +#include "archive.h" +#include "archive_entry.h" +#include "archive_private.h" +#include "archive_read_disk_private.h" + +/* + * Linux and FreeBSD plug this obvious hole in POSIX.1e in + * different ways. + */ +#if HAVE_ACL_GET_PERM +#define ACL_GET_PERM acl_get_perm +#elif HAVE_ACL_GET_PERM_NP +#define ACL_GET_PERM acl_get_perm_np +#endif + +static int setup_acls_posix1e(struct archive_read_disk *, + struct archive_entry *, int fd); +static int setup_xattrs(struct archive_read_disk *, + struct archive_entry *, int fd); + +int +archive_read_disk_entry_from_file(struct archive *_a, + struct archive_entry *entry, + int fd, const struct stat *st) +{ + struct archive_read_disk *a = (struct archive_read_disk *)_a; + const char *path, *name; + struct stat s; + int initial_fd = fd; + int r, r1; + + archive_clear_error(_a); + path = archive_entry_sourcepath(entry); + if (path == NULL) + path = archive_entry_pathname(entry); + +#ifdef EXT2_IOC_GETFLAGS + /* Linux requires an extra ioctl to pull the flags. Although + * this is an extra step, it has a nice side-effect: We get an + * open file descriptor which we can use in the subsequent lookups.
*/ + if ((S_ISREG(st->st_mode) || S_ISDIR(st->st_mode))) { + if (fd < 0) + fd = open(path, O_RDONLY | O_NONBLOCK | O_BINARY); + if (fd >= 0) { + unsigned long stflags; + int r = ioctl(fd, EXT2_IOC_GETFLAGS, &stflags); + if (r == 0 && stflags != 0) + archive_entry_set_fflags(entry, stflags, 0); + } + } +#endif + + if (st == NULL) { + /* TODO: On Windows, use GetFileInfoByHandle() here. + * Using Windows stat() call is badly broken, but + * even the stat() wrapper has problems because + * 'struct stat' is broken on Windows. + */ +#if HAVE_FSTAT + if (fd >= 0) { + if (fstat(fd, &s) != 0) { + archive_set_error(&a->archive, errno, + "Can't fstat"); + return (ARCHIVE_FAILED); + } + } else +#endif +#if HAVE_LSTAT + if (!a->follow_symlinks) { + if (lstat(path, &s) != 0) { + archive_set_error(&a->archive, errno, + "Can't lstat %s", path); + return (ARCHIVE_FAILED); + } + } else +#endif + if (stat(path, &s) != 0) { + archive_set_error(&a->archive, errno, + "Can't stat %s", path); + return (ARCHIVE_FAILED); + } + st = &s; + } + archive_entry_copy_stat(entry, st); + + /* Lookup uname/gname */ + name = archive_read_disk_uname(_a, archive_entry_uid(entry)); + if (name != NULL) + archive_entry_copy_uname(entry, name); + name = archive_read_disk_gname(_a, archive_entry_gid(entry)); + if (name != NULL) + archive_entry_copy_gname(entry, name); + +#ifdef HAVE_STRUCT_STAT_ST_FLAGS + /* On FreeBSD, we get flags for free with the stat. */ + /* TODO: Does this belong in copy_stat()?
*/ + if (st->st_flags != 0) + archive_entry_set_fflags(entry, st->st_flags, 0); +#endif + +#ifdef HAVE_READLINK + if (S_ISLNK(st->st_mode)) { + char linkbuffer[PATH_MAX + 1]; + int lnklen = readlink(path, linkbuffer, PATH_MAX); + if (lnklen < 0) { + archive_set_error(&a->archive, errno, + "Couldn't read link data"); + return (ARCHIVE_FAILED); + } + linkbuffer[lnklen] = 0; + archive_entry_set_symlink(entry, linkbuffer); + } +#endif + + r = setup_acls_posix1e(a, entry, fd); + r1 = setup_xattrs(a, entry, fd); + if (r1 < r) + r = r1; + /* If we opened the file earlier in this function, close it. */ + if (initial_fd != fd) + close(fd); + return (r); +} + +#ifdef HAVE_POSIX_ACL +static void setup_acl_posix1e(struct archive_read_disk *a, + struct archive_entry *entry, acl_t acl, int archive_entry_acl_type); + +static int +setup_acls_posix1e(struct archive_read_disk *a, + struct archive_entry *entry, int fd) +{ + const char *accpath; + acl_t acl; + + accpath = archive_entry_sourcepath(entry); + if (accpath == NULL) + accpath = archive_entry_pathname(entry); + + archive_entry_acl_clear(entry); + + /* Retrieve access ACL from file. */ + if (fd >= 0) + acl = acl_get_fd(fd); +#if HAVE_ACL_GET_LINK_NP + else if (!a->follow_symlinks) + acl = acl_get_link_np(accpath, ACL_TYPE_ACCESS); +#else + else if ((!a->follow_symlinks) + && (archive_entry_filetype(entry) == AE_IFLNK)) + /* We can't get the ACL of a symlink, so we assume it can't + have one. */ + acl = NULL; +#endif + else + acl = acl_get_file(accpath, ACL_TYPE_ACCESS); + if (acl != NULL) { + setup_acl_posix1e(a, entry, acl, + ARCHIVE_ENTRY_ACL_TYPE_ACCESS); + acl_free(acl); + } + + /* Only directories can have default ACLs. */ + if (S_ISDIR(archive_entry_mode(entry))) { + acl = acl_get_file(accpath, ACL_TYPE_DEFAULT); + if (acl != NULL) { + setup_acl_posix1e(a, entry, acl, + ARCHIVE_ENTRY_ACL_TYPE_DEFAULT); + acl_free(acl); + } + } + return (ARCHIVE_OK); +} + +/* + * Translate POSIX.1e ACL into libarchive internal structure. 
+ */ +static void +setup_acl_posix1e(struct archive_read_disk *a, + struct archive_entry *entry, acl_t acl, int archive_entry_acl_type) +{ + acl_tag_t acl_tag; + acl_entry_t acl_entry; + acl_permset_t acl_permset; + int s, ae_id, ae_tag, ae_perm; + const char *ae_name; + + s = acl_get_entry(acl, ACL_FIRST_ENTRY, &acl_entry); + while (s == 1) { + ae_id = -1; + ae_name = NULL; + + acl_get_tag_type(acl_entry, &acl_tag); + if (acl_tag == ACL_USER) { + ae_id = (int)*(uid_t *)acl_get_qualifier(acl_entry); + ae_name = archive_read_disk_uname(&a->archive, ae_id); + ae_tag = ARCHIVE_ENTRY_ACL_USER; + } else if (acl_tag == ACL_GROUP) { + ae_id = (int)*(gid_t *)acl_get_qualifier(acl_entry); + ae_name = archive_read_disk_gname(&a->archive, ae_id); + ae_tag = ARCHIVE_ENTRY_ACL_GROUP; + } else if (acl_tag == ACL_MASK) { + ae_tag = ARCHIVE_ENTRY_ACL_MASK; + } else if (acl_tag == ACL_USER_OBJ) { + ae_tag = ARCHIVE_ENTRY_ACL_USER_OBJ; + } else if (acl_tag == ACL_GROUP_OBJ) { + ae_tag = ARCHIVE_ENTRY_ACL_GROUP_OBJ; + } else if (acl_tag == ACL_OTHER) { + ae_tag = ARCHIVE_ENTRY_ACL_OTHER; + } else { + /* Skip types that libarchive can't support. */ + continue; + } + + acl_get_permset(acl_entry, &acl_permset); + ae_perm = 0; + /* + * acl_get_perm() is spelled differently on different + * platforms; see above. 
+ */ + if (ACL_GET_PERM(acl_permset, ACL_EXECUTE)) + ae_perm |= ARCHIVE_ENTRY_ACL_EXECUTE; + if (ACL_GET_PERM(acl_permset, ACL_READ)) + ae_perm |= ARCHIVE_ENTRY_ACL_READ; + if (ACL_GET_PERM(acl_permset, ACL_WRITE)) + ae_perm |= ARCHIVE_ENTRY_ACL_WRITE; + + archive_entry_acl_add_entry(entry, + archive_entry_acl_type, ae_perm, ae_tag, + ae_id, ae_name); + + s = acl_get_entry(acl, ACL_NEXT_ENTRY, &acl_entry); + } +} +#else +static int +setup_acls_posix1e(struct archive_read_disk *a, + struct archive_entry *entry, int fd) +{ + (void)a; /* UNUSED */ + (void)entry; /* UNUSED */ + (void)fd; /* UNUSED */ + return (ARCHIVE_OK); +} +#endif + +#if HAVE_LISTXATTR && HAVE_LLISTXATTR && HAVE_GETXATTR && HAVE_LGETXATTR + +/* + * Linux extended attribute support. + * + * TODO: By using a stack-allocated buffer for the first + * call to getxattr(), we might be able to avoid the second + * call entirely. We only need the second call if the + * stack-allocated buffer is too small. But a modest buffer + * of 1024 bytes or so will often be big enough. Same applies + * to listxattr(). 
+ */ + + +static int +setup_xattr(struct archive_read_disk *a, + struct archive_entry *entry, const char *name, int fd) +{ + ssize_t size; + void *value = NULL; + const char *accpath; + + (void)fd; /* UNUSED */ + + accpath = archive_entry_sourcepath(entry); + if (accpath == NULL) + accpath = archive_entry_pathname(entry); + + if (!a->follow_symlinks) + size = lgetxattr(accpath, name, NULL, 0); + else + size = getxattr(accpath, name, NULL, 0); + + if (size == -1) { + archive_set_error(&a->archive, errno, + "Couldn't query extended attribute"); + return (ARCHIVE_WARN); + } + + if (size > 0 && (value = malloc(size)) == NULL) { + archive_set_error(&a->archive, errno, "Out of memory"); + return (ARCHIVE_FATAL); + } + + if (!a->follow_symlinks) + size = lgetxattr(accpath, name, value, size); + else + size = getxattr(accpath, name, value, size); + + if (size == -1) { + archive_set_error(&a->archive, errno, + "Couldn't read extended attribute"); + return (ARCHIVE_WARN); + } + + archive_entry_xattr_add_entry(entry, name, value, size); + + free(value); + return (ARCHIVE_OK); +} + +static int +setup_xattrs(struct archive_read_disk *a, + struct archive_entry *entry, int fd) +{ + char *list, *p; + const char *path; + ssize_t list_size; + + + path = archive_entry_sourcepath(entry); + if (path == NULL) + path = archive_entry_pathname(entry); + + if (!a->follow_symlinks) + list_size = llistxattr(path, NULL, 0); + else + list_size = listxattr(path, NULL, 0); + + if (list_size == -1) { + if (errno == ENOTSUP) + return (ARCHIVE_OK); + archive_set_error(&a->archive, errno, + "Couldn't list extended attributes"); + return (ARCHIVE_WARN); + } + + if (list_size == 0) + return (ARCHIVE_OK); + + if ((list = malloc(list_size)) == NULL) { + archive_set_error(&a->archive, errno, "Out of memory"); + return (ARCHIVE_FATAL); + } + + if (!a->follow_symlinks) + list_size = llistxattr(path, list, list_size); + else + list_size = listxattr(path, list, list_size); + + if (list_size == -1) { + 
archive_set_error(&a->archive, errno, + "Couldn't retrieve extended attributes"); + free(list); + return (ARCHIVE_WARN); + } + + for (p = list; (p - list) < list_size; p += strlen(p) + 1) { + if (strncmp(p, "system.", 7) == 0 || + strncmp(p, "xfsroot.", 8) == 0) + continue; + setup_xattr(a, entry, p, fd); + } + + free(list); + return (ARCHIVE_OK); +} + +#elif HAVE_EXTATTR_GET_FILE && HAVE_EXTATTR_LIST_FILE + +/* + * FreeBSD extattr interface. + */ + +/* TODO: Implement this. Follow the Linux model above, but + * with FreeBSD-specific system calls, of course. Be careful + * to not include the system extattrs that hold ACLs; we handle + * those separately. + */ +static int +setup_xattr(struct archive_read_disk *a, struct archive_entry *entry, + int namespace, const char *name, const char *fullname, int fd); + +static int +setup_xattr(struct archive_read_disk *a, struct archive_entry *entry, + int namespace, const char *name, const char *fullname, int fd) +{ + ssize_t size; + void *value = NULL; + const char *accpath; + + (void)fd; /* UNUSED */ + + accpath = archive_entry_sourcepath(entry); + if (accpath == NULL) + accpath = archive_entry_pathname(entry); + + if (!a->follow_symlinks) + size = extattr_get_link(accpath, namespace, name, NULL, 0); + else + size = extattr_get_file(accpath, namespace, name, NULL, 0); + + if (size == -1) { + archive_set_error(&a->archive, errno, + "Couldn't query extended attribute"); + return (ARCHIVE_WARN); + } + + if (size > 0 && (value = malloc(size)) == NULL) { + archive_set_error(&a->archive, errno, "Out of memory"); + return (ARCHIVE_FATAL); + } + + if (!a->follow_symlinks) + size = extattr_get_link(accpath, namespace, name, value, size); + else + size = extattr_get_file(accpath, namespace, name, value, size); + + if (size == -1) { + archive_set_error(&a->archive, errno, + "Couldn't read extended attribute"); + return (ARCHIVE_WARN); + } + + archive_entry_xattr_add_entry(entry, fullname, value, size); + + free(value); + return 
(ARCHIVE_OK); +} + +static int +setup_xattrs(struct archive_read_disk *a, + struct archive_entry *entry, int fd) +{ + char buff[512]; + char *list, *p; + ssize_t list_size; + const char *path; + int namespace = EXTATTR_NAMESPACE_USER; + + path = archive_entry_sourcepath(entry); + if (path == NULL) + path = archive_entry_pathname(entry); + + if (!a->follow_symlinks) + list_size = extattr_list_link(path, namespace, NULL, 0); + else + list_size = extattr_list_file(path, namespace, NULL, 0); + + if (list_size == -1 && errno == EOPNOTSUPP) + return (ARCHIVE_OK); + if (list_size == -1) { + archive_set_error(&a->archive, errno, + "Couldn't list extended attributes"); + return (ARCHIVE_WARN); + } + + if (list_size == 0) + return (ARCHIVE_OK); + + if ((list = malloc(list_size)) == NULL) { + archive_set_error(&a->archive, errno, "Out of memory"); + return (ARCHIVE_FATAL); + } + + if (!a->follow_symlinks) + list_size = extattr_list_link(path, namespace, list, list_size); + else + list_size = extattr_list_file(path, namespace, list, list_size); + + if (list_size == -1) { + archive_set_error(&a->archive, errno, + "Couldn't retrieve extended attributes"); + free(list); + return (ARCHIVE_WARN); + } + + p = list; + while ((p - list) < list_size) { + size_t len = 255 & (int)*p; + char *name; + + strcpy(buff, "user."); + name = buff + strlen(buff); + memcpy(name, p + 1, len); + name[len] = '\0'; + setup_xattr(a, entry, namespace, name, buff, fd); + p += 1 + len; + } + + free(list); + return (ARCHIVE_OK); +} + +#else + +/* + * Generic (stub) extended attribute support. 
+ */ +static int +setup_xattrs(struct archive_read_disk *a, + struct archive_entry *entry, int fd) +{ + (void)a; /* UNUSED */ + (void)entry; /* UNUSED */ + (void)fd; /* UNUSED */ + return (ARCHIVE_OK); +} + +#endif diff --git a/lib/libarchive/archive_read_disk_private.h b/lib/libarchive/archive_read_disk_private.h new file mode 100644 index 000000000..77bae39ef --- /dev/null +++ b/lib/libarchive/archive_read_disk_private.h @@ -0,0 +1,62 @@ +/*- + * Copyright (c) 2003-2009 Tim Kientzle + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer + * in this position and unchanged. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR(S) ``AS IS'' AND ANY EXPRESS OR + * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES + * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. + * IN NO EVENT SHALL THE AUTHOR(S) BE LIABLE FOR ANY DIRECT, INDIRECT, + * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT + * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF + * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 
+ * + * $FreeBSD: head/lib/libarchive/archive_read_disk_private.h 201105 2009-12-28 03:20:54Z kientzle $ + */ + +#ifndef __LIBARCHIVE_BUILD +#error This header is only to be used internally to libarchive. +#endif + +#ifndef ARCHIVE_READ_DISK_PRIVATE_H_INCLUDED +#define ARCHIVE_READ_DISK_PRIVATE_H_INCLUDED + +struct archive_read_disk { + struct archive archive; + + /* + * Symlink mode is one of 'L'ogical, 'P'hysical, or 'H'ybrid, + * following an old BSD convention. 'L' follows all symlinks, + * 'P' follows none, 'H' follows symlinks only for the first + * item. + */ + char symlink_mode; + + /* + * Since symlink interaction changes, we need to track whether + * we're following symlinks for the current item. 'L' mode above + * sets this true, 'P' sets it false, 'H' changes it as we traverse. + */ + char follow_symlinks; /* Either 'L' or 'P'. */ + + const char * (*lookup_gname)(void *private, gid_t gid); + void (*cleanup_gname)(void *private); + void *lookup_gname_data; + const char * (*lookup_uname)(void *private, uid_t gid); + void (*cleanup_uname)(void *private); + void *lookup_uname_data; +}; + +#endif diff --git a/lib/libarchive/archive_read_disk_set_standard_lookup.c b/lib/libarchive/archive_read_disk_set_standard_lookup.c new file mode 100644 index 000000000..5c4a0c4b6 --- /dev/null +++ b/lib/libarchive/archive_read_disk_set_standard_lookup.c @@ -0,0 +1,282 @@ +/*- + * Copyright (c) 2003-2007 Tim Kientzle + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. 
+ * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR(S) ``AS IS'' AND ANY EXPRESS OR + * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES + * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. + * IN NO EVENT SHALL THE AUTHOR(S) BE LIABLE FOR ANY DIRECT, INDIRECT, + * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT + * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF + * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + */ + +#include "archive_platform.h" +__FBSDID("$FreeBSD: head/lib/libarchive/archive_read_disk_set_standard_lookup.c 201109 2009-12-28 03:30:31Z kientzle $"); + +#ifdef HAVE_SYS_TYPES_H +#include <sys/types.h> +#endif +#ifdef HAVE_ERRNO_H +#include <errno.h> +#endif +#ifdef HAVE_GRP_H +#include <grp.h> +#endif +#ifdef HAVE_PWD_H +#include <pwd.h> +#endif +#ifdef HAVE_STDLIB_H +#include <stdlib.h> +#endif +#ifdef HAVE_STRING_H +#include <string.h> +#endif + +#include "archive.h" + +#if defined(_WIN32) && !defined(__CYGWIN__) +int +archive_read_disk_set_standard_lookup(struct archive *a) +{ + archive_set_error(a, -1, "Standard lookups not available on Windows"); + return (ARCHIVE_FATAL); +} +#else /* ! 
(_WIN32 && !__CYGWIN__) */ +#define name_cache_size 127 + +static const char * const NO_NAME = "(noname)"; + +struct name_cache { + struct archive *archive; + char *buff; + size_t buff_size; + int probes; + int hits; + size_t size; + struct { + id_t id; + const char *name; + } cache[name_cache_size]; +}; + +static const char * lookup_gname(void *, gid_t); +static const char * lookup_uname(void *, uid_t); +static void cleanup(void *); +static const char * lookup_gname_helper(struct name_cache *, id_t gid); +static const char * lookup_uname_helper(struct name_cache *, id_t uid); + +/* + * Installs functions that use getpwuid()/getgrgid()---along with + * a simple cache to accelerate such lookups---into the archive_read_disk + * object. This is in a separate file because getpwuid()/getgrgid() + * can pull in a LOT of library code (including NIS/LDAP functions, which + * pull in DNS resolvers, etc). This can easily top 500kB, which makes + * it inappropriate for some space-constrained applications. + * + * Applications that are size-sensitive may want to just use the + * real default functions (defined in archive_read_disk.c) that just + * use the uid/gid without the lookup. Or define your own custom functions + * if you prefer. 
+ */ +int +archive_read_disk_set_standard_lookup(struct archive *a) +{ + struct name_cache *ucache = malloc(sizeof(struct name_cache)); + struct name_cache *gcache = malloc(sizeof(struct name_cache)); + + if (ucache == NULL || gcache == NULL) { + archive_set_error(a, ENOMEM, + "Can't allocate uname/gname lookup cache"); + free(ucache); + free(gcache); + return (ARCHIVE_FATAL); + } + + memset(ucache, 0, sizeof(*ucache)); + ucache->archive = a; + ucache->size = name_cache_size; + memset(gcache, 0, sizeof(*gcache)); + gcache->archive = a; + gcache->size = name_cache_size; + + archive_read_disk_set_gname_lookup(a, gcache, lookup_gname, cleanup); + archive_read_disk_set_uname_lookup(a, ucache, lookup_uname, cleanup); + + return (ARCHIVE_OK); +} + +static void +cleanup(void *data) +{ + struct name_cache *cache = (struct name_cache *)data; + size_t i; + + if (cache != NULL) { + for (i = 0; i < cache->size; i++) { + if (cache->cache[i].name != NULL && + cache->cache[i].name != NO_NAME) + free((void *)(uintptr_t)cache->cache[i].name); + } + free(cache->buff); + free(cache); + } +} + +/* + * Look up uname/gname from uid/gid, return NULL if no match. + */ +static const char * +lookup_name(struct name_cache *cache, + const char * (*lookup_fn)(struct name_cache *, id_t), id_t id) +{ + const char *name; + int slot; + + + cache->probes++; + + slot = id % cache->size; + if (cache->cache[slot].name != NULL) { + if (cache->cache[slot].id == id) { + cache->hits++; + if (cache->cache[slot].name == NO_NAME) + return (NULL); + return (cache->cache[slot].name); + } + if (cache->cache[slot].name != NO_NAME) + free((void *)(uintptr_t)cache->cache[slot].name); + cache->cache[slot].name = NULL; + } + + name = (lookup_fn)(cache, id); + if (name == NULL) { + /* Cache and return the negative response. 
*/ + cache->cache[slot].name = NO_NAME; + cache->cache[slot].id = id; + return (NULL); + } + + cache->cache[slot].name = name; + cache->cache[slot].id = id; + return (cache->cache[slot].name); +} + +static const char * +lookup_uname(void *data, uid_t uid) +{ + struct name_cache *uname_cache = (struct name_cache *)data; + return (lookup_name(uname_cache, + &lookup_uname_helper, (id_t)uid)); +} + +static const char * +lookup_uname_helper(struct name_cache *cache, id_t id) +{ + struct passwd pwent, *result; + int r; + + if (cache->buff_size == 0) { + cache->buff_size = 256; + cache->buff = malloc(cache->buff_size); + } + if (cache->buff == NULL) + return (NULL); + for (;;) { + result = &pwent; /* Old getpwuid_r ignores last arg. */ +#if defined(HAVE_GETPWUID_R) + r = getpwuid_r((uid_t)id, &pwent, + cache->buff, cache->buff_size, &result); +#else + result = getpwuid((uid_t)id); + r = errno; +#endif + if (r == 0) + break; + if (r != ERANGE) + break; + /* ERANGE means our buffer was too small, but POSIX + * doesn't tell us how big the buffer should be, so + * we just double it and try again. Because the buffer + * is kept around in the cache object, we shouldn't + * have to do this very often. 
*/ + cache->buff_size *= 2; + cache->buff = realloc(cache->buff, cache->buff_size); + if (cache->buff == NULL) + break; + } + if (r != 0) { + archive_set_error(cache->archive, errno, + "Can't lookup user for id %d", (int)id); + return (NULL); + } + if (result == NULL) + return (NULL); + + return strdup(result->pw_name); +} + +static const char * +lookup_gname(void *data, gid_t gid) +{ + struct name_cache *gname_cache = (struct name_cache *)data; + return (lookup_name(gname_cache, + &lookup_gname_helper, (id_t)gid)); +} + +static const char * +lookup_gname_helper(struct name_cache *cache, id_t id) +{ + struct group grent, *result; + int r; + + if (cache->buff_size == 0) { + cache->buff_size = 256; + cache->buff = malloc(cache->buff_size); + } + if (cache->buff == NULL) + return (NULL); + for (;;) { + result = &grent; /* Old getgrgid_r ignores last arg. */ +#if defined(HAVE_GETGRGID_R) + r = getgrgid_r((gid_t)id, &grent, + cache->buff, cache->buff_size, &result); +#else + result = getgrgid((gid_t)id); + r = errno; +#endif + if (r == 0) + break; + if (r != ERANGE) + break; + /* ERANGE means our buffer was too small, but POSIX + * doesn't tell us how big the buffer should be, so + * we just double it and try again. */ + cache->buff_size *= 2; + cache->buff = realloc(cache->buff, cache->buff_size); + if (cache->buff == NULL) + break; + } + if (r != 0) { + archive_set_error(cache->archive, errno, + "Can't lookup group for id %d", (int)id); + return (NULL); + } + if (result == NULL) + return (NULL); + + return strdup(result->gr_name); +} +#endif /* ! (_WIN32 && !__CYGWIN__) */ diff --git a/lib/libarchive/archive_read_extract.c b/lib/libarchive/archive_read_extract.c new file mode 100644 index 000000000..e1027995e --- /dev/null +++ b/lib/libarchive/archive_read_extract.c @@ -0,0 +1,182 @@ +/*- + * Copyright (c) 2003-2007 Tim Kientzle + * All rights reserved. 
+ * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR(S) ``AS IS'' AND ANY EXPRESS OR + * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES + * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. + * IN NO EVENT SHALL THE AUTHOR(S) BE LIABLE FOR ANY DIRECT, INDIRECT, + * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT + * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF + * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + */ + +#include "archive_platform.h" +__FBSDID("$FreeBSD: src/lib/libarchive/archive_read_extract.c,v 1.61 2008/05/26 17:00:22 kientzle Exp $"); + +#ifdef HAVE_SYS_TYPES_H +#include <sys/types.h> +#endif +#ifdef HAVE_ERRNO_H +#include <errno.h> +#endif +#ifdef HAVE_STDLIB_H +#include <stdlib.h> +#endif +#ifdef HAVE_STRING_H +#include <string.h> +#endif + +#include "archive.h" +#include "archive_private.h" +#include "archive_read_private.h" +#include "archive_write_disk_private.h" + +struct extract { + struct archive *ad; /* archive_write_disk object */ + + /* Progress function invoked during extract. 
*/ + void (*extract_progress)(void *); + void *extract_progress_user_data; +}; + +static int archive_read_extract_cleanup(struct archive_read *); +static int copy_data(struct archive *ar, struct archive *aw); +static struct extract *get_extract(struct archive_read *); + +static struct extract * +get_extract(struct archive_read *a) +{ + /* If we haven't initialized, do it now. */ + /* This also sets up a lot of global state. */ + if (a->extract == NULL) { + a->extract = (struct extract *)malloc(sizeof(*a->extract)); + if (a->extract == NULL) { + archive_set_error(&a->archive, ENOMEM, "Can't extract"); + return (NULL); + } + memset(a->extract, 0, sizeof(*a->extract)); + a->extract->ad = archive_write_disk_new(); + if (a->extract->ad == NULL) { + archive_set_error(&a->archive, ENOMEM, "Can't extract"); + return (NULL); + } + archive_write_disk_set_standard_lookup(a->extract->ad); + a->cleanup_archive_extract = archive_read_extract_cleanup; + } + return (a->extract); +} + +int +archive_read_extract(struct archive *_a, struct archive_entry *entry, int flags) +{ + struct extract *extract; + + extract = get_extract((struct archive_read *)_a); + if (extract == NULL) + return (ARCHIVE_FATAL); + archive_write_disk_set_options(extract->ad, flags); + return (archive_read_extract2(_a, entry, extract->ad)); +} + +int +archive_read_extract2(struct archive *_a, struct archive_entry *entry, + struct archive *ad) +{ + struct archive_read *a = (struct archive_read *)_a; + int r, r2; + + /* Set up for this particular entry. */ + archive_write_disk_set_skip_file(ad, + a->skip_file_dev, a->skip_file_ino); + r = archive_write_header(ad, entry); + if (r < ARCHIVE_WARN) + r = ARCHIVE_WARN; + if (r != ARCHIVE_OK) + /* If _write_header failed, copy the error. */ + archive_copy_error(&a->archive, ad); + else + /* Otherwise, pour data into the entry. */ + r = copy_data(_a, ad); + r2 = archive_write_finish_entry(ad); + if (r2 < ARCHIVE_WARN) + r2 = ARCHIVE_WARN; + /* Use the first message. 
*/ + if (r2 != ARCHIVE_OK && r == ARCHIVE_OK) + archive_copy_error(&a->archive, ad); + /* Use the worst error return. */ + if (r2 < r) + r = r2; + return (r); +} + +void +archive_read_extract_set_progress_callback(struct archive *_a, + void (*progress_func)(void *), void *user_data) +{ + struct archive_read *a = (struct archive_read *)_a; + struct extract *extract = get_extract(a); + if (extract != NULL) { + extract->extract_progress = progress_func; + extract->extract_progress_user_data = user_data; + } +} + +static int +copy_data(struct archive *ar, struct archive *aw) +{ + off_t offset; + const void *buff; + struct extract *extract; + size_t size; + int r; + + extract = get_extract((struct archive_read *)ar); + for (;;) { + r = archive_read_data_block(ar, &buff, &size, &offset); + if (r == ARCHIVE_EOF) + return (ARCHIVE_OK); + if (r != ARCHIVE_OK) + return (r); + r = archive_write_data_block(aw, buff, size, offset); + if (r < ARCHIVE_WARN) + r = ARCHIVE_WARN; + if (r != ARCHIVE_OK) { + archive_set_error(ar, archive_errno(aw), + "%s", archive_error_string(aw)); + return (r); + } + if (extract->extract_progress) + (extract->extract_progress) + (extract->extract_progress_user_data); + } +} + +/* + * Cleanup function for archive_extract. + */ +static int +archive_read_extract_cleanup(struct archive_read *a) +{ + int ret = ARCHIVE_OK; + +#if ARCHIVE_API_VERSION > 1 + ret = +#endif + archive_write_finish(a->extract->ad); + free(a->extract); + a->extract = NULL; + return (ret); +} diff --git a/lib/libarchive/archive_read_open_fd.c b/lib/libarchive/archive_read_open_fd.c new file mode 100644 index 000000000..33b7cba1b --- /dev/null +++ b/lib/libarchive/archive_read_open_fd.c @@ -0,0 +1,186 @@ +/*- + * Copyright (c) 2003-2007 Tim Kientzle + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. 
Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR(S) ``AS IS'' AND ANY EXPRESS OR + * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES + * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. + * IN NO EVENT SHALL THE AUTHOR(S) BE LIABLE FOR ANY DIRECT, INDIRECT, + * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT + * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF + * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 
+ */ + +#include "archive_platform.h" +__FBSDID("$FreeBSD: head/lib/libarchive/archive_read_open_fd.c 201103 2009-12-28 03:13:49Z kientzle $"); + +#ifdef HAVE_SYS_STAT_H +#include <sys/stat.h> +#endif +#ifdef HAVE_ERRNO_H +#include <errno.h> +#endif +#ifdef HAVE_FCNTL_H +#include <fcntl.h> +#endif +#ifdef HAVE_IO_H +#include <io.h> +#endif +#ifdef HAVE_STDLIB_H +#include <stdlib.h> +#endif +#ifdef HAVE_STRING_H +#include <string.h> +#endif +#ifdef HAVE_UNISTD_H +#include <unistd.h> +#endif + +#include "archive.h" + +struct read_fd_data { + int fd; + size_t block_size; + char can_skip; + void *buffer; +}; + +static int file_close(struct archive *, void *); +static ssize_t file_read(struct archive *, void *, const void **buff); +#if ARCHIVE_API_VERSION < 2 +static ssize_t file_skip(struct archive *, void *, size_t request); +#else +static off_t file_skip(struct archive *, void *, off_t request); +#endif + +int +archive_read_open_fd(struct archive *a, int fd, size_t block_size) +{ + struct stat st; + struct read_fd_data *mine; + void *b; + + archive_clear_error(a); + if (fstat(fd, &st) != 0) { + archive_set_error(a, errno, "Can't stat fd %d", fd); + return (ARCHIVE_FATAL); + } + + mine = (struct read_fd_data *)malloc(sizeof(*mine)); + b = malloc(block_size); + if (mine == NULL || b == NULL) { + archive_set_error(a, ENOMEM, "No memory"); + free(mine); + free(b); + return (ARCHIVE_FATAL); + } + mine->block_size = block_size; + mine->buffer = b; + mine->fd = fd; + /* + * Skip support is a performance optimization for anything + * that supports lseek(). On FreeBSD, only regular files and + * raw disk devices support lseek() and there's no portable + * way to determine if a device is a raw disk device, so we + * only enable this optimization for regular files. 
+ */ + if (S_ISREG(st.st_mode)) { + archive_read_extract_set_skip_file(a, st.st_dev, st.st_ino); + mine->can_skip = 1; + } else + mine->can_skip = 0; +#if defined(__CYGWIN__) || defined(_WIN32) + setmode(mine->fd, O_BINARY); +#endif + + return (archive_read_open2(a, mine, + NULL, file_read, file_skip, file_close)); +} + +static ssize_t +file_read(struct archive *a, void *client_data, const void **buff) +{ + struct read_fd_data *mine = (struct read_fd_data *)client_data; + ssize_t bytes_read; + + *buff = mine->buffer; + bytes_read = read(mine->fd, mine->buffer, mine->block_size); + if (bytes_read < 0) { + archive_set_error(a, errno, "Error reading fd %d", mine->fd); + } + return (bytes_read); +} + +#if ARCHIVE_API_VERSION < 2 +static ssize_t +file_skip(struct archive *a, void *client_data, size_t request) +#else +static off_t +file_skip(struct archive *a, void *client_data, off_t request) +#endif +{ + struct read_fd_data *mine = (struct read_fd_data *)client_data; + off_t old_offset, new_offset; + + if (!mine->can_skip) + return (0); + + /* Reduce request to the next smallest multiple of block_size */ + request = (request / mine->block_size) * mine->block_size; + if (request == 0) + return (0); + + /* + * Hurray for lazy evaluation: if the first lseek fails, the second + * one will not be executed. + */ + if (((old_offset = lseek(mine->fd, 0, SEEK_CUR)) < 0) || + ((new_offset = lseek(mine->fd, request, SEEK_CUR)) < 0)) + { + /* If seek failed once, it will probably fail again. */ + mine->can_skip = 0; + + if (errno == ESPIPE) + { + /* + * Failure to lseek() can be caused by the file + * descriptor pointing to a pipe, socket or FIFO. + * Return 0 here, so the compression layer will use + * read()s instead to advance the file descriptor. + * It's slower of course, but works as well. + */ + return (0); + } + /* + * There's been an error other than ESPIPE. This is most + * likely caused by a programmer error (too large request) + * or a corrupted archive file. 
+ */ + archive_set_error(a, errno, "Error seeking"); + return (-1); + } + return (new_offset - old_offset); +} + +static int +file_close(struct archive *a, void *client_data) +{ + struct read_fd_data *mine = (struct read_fd_data *)client_data; + + (void)a; /* UNUSED */ + free(mine->buffer); + free(mine); + return (ARCHIVE_OK); +} diff --git a/lib/libarchive/archive_read_open_file.c b/lib/libarchive/archive_read_open_file.c new file mode 100644 index 000000000..095ae6eb5 --- /dev/null +++ b/lib/libarchive/archive_read_open_file.c @@ -0,0 +1,165 @@ +/*- + * Copyright (c) 2003-2007 Tim Kientzle + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR(S) ``AS IS'' AND ANY EXPRESS OR + * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES + * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. + * IN NO EVENT SHALL THE AUTHOR(S) BE LIABLE FOR ANY DIRECT, INDIRECT, + * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT + * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF + * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 
+ */ + +#include "archive_platform.h" +__FBSDID("$FreeBSD: head/lib/libarchive/archive_read_open_file.c 201093 2009-12-28 02:28:44Z kientzle $"); + +#ifdef HAVE_SYS_STAT_H +#include <sys/stat.h> +#endif +#ifdef HAVE_ERRNO_H +#include <errno.h> +#endif +#ifdef HAVE_FCNTL_H +#include <fcntl.h> +#endif +#ifdef HAVE_IO_H +#include <io.h> +#endif +#ifdef HAVE_STDLIB_H +#include <stdlib.h> +#endif +#ifdef HAVE_STRING_H +#include <string.h> +#endif +#ifdef HAVE_UNISTD_H +#include <unistd.h> +#endif + +#include "archive.h" + +struct read_FILE_data { + FILE *f; + size_t block_size; + void *buffer; + char can_skip; +}; + +static int file_close(struct archive *, void *); +static ssize_t file_read(struct archive *, void *, const void **buff); +#if ARCHIVE_API_VERSION < 2 +static ssize_t file_skip(struct archive *, void *, size_t request); +#else +static off_t file_skip(struct archive *, void *, off_t request); +#endif + +int +archive_read_open_FILE(struct archive *a, FILE *f) +{ + struct stat st; + struct read_FILE_data *mine; + size_t block_size = 128 * 1024; + void *b; + + archive_clear_error(a); + mine = (struct read_FILE_data *)malloc(sizeof(*mine)); + b = malloc(block_size); + if (mine == NULL || b == NULL) { + archive_set_error(a, ENOMEM, "No memory"); + free(mine); + free(b); + return (ARCHIVE_FATAL); + } + mine->block_size = block_size; + mine->buffer = b; + mine->f = f; + /* + * If we can't fstat() the file, it may just be that it's not + * a file. (FILE * objects can wrap many kinds of I/O + * streams, some of which don't support fileno()).) + */ + if (fstat(fileno(mine->f), &st) == 0 && S_ISREG(st.st_mode)) { + archive_read_extract_set_skip_file(a, st.st_dev, st.st_ino); + /* Enable the seek optimization only for regular files. 
*/ + mine->can_skip = 1; + } else + mine->can_skip = 0; + +#if defined(__CYGWIN__) || defined(_WIN32) + setmode(fileno(mine->f), O_BINARY); +#endif + + return (archive_read_open2(a, mine, NULL, file_read, + file_skip, file_close)); +} + +static ssize_t +file_read(struct archive *a, void *client_data, const void **buff) +{ + struct read_FILE_data *mine = (struct read_FILE_data *)client_data; + ssize_t bytes_read; + + *buff = mine->buffer; + bytes_read = fread(mine->buffer, 1, mine->block_size, mine->f); + if (bytes_read < 0) { + archive_set_error(a, errno, "Error reading file"); + } + return (bytes_read); +} + +#if ARCHIVE_API_VERSION < 2 +static ssize_t +file_skip(struct archive *a, void *client_data, size_t request) +#else +static off_t +file_skip(struct archive *a, void *client_data, off_t request) +#endif +{ + struct read_FILE_data *mine = (struct read_FILE_data *)client_data; + + (void)a; /* UNUSED */ + + /* + * If we can't skip, return 0 as the amount we did step and + * the caller will work around by reading and discarding. + */ + if (!mine->can_skip) + return (0); + if (request == 0) + return (0); + +#if HAVE_FSEEKO + if (fseeko(mine->f, request, SEEK_CUR) != 0) +#else + if (fseek(mine->f, request, SEEK_CUR) != 0) +#endif + { + mine->can_skip = 0; + return (0); + } + return (request); +} + +static int +file_close(struct archive *a, void *client_data) +{ + struct read_FILE_data *mine = (struct read_FILE_data *)client_data; + + (void)a; /* UNUSED */ + if (mine->buffer != NULL) + free(mine->buffer); + free(mine); + return (ARCHIVE_OK); +} diff --git a/lib/libarchive/archive_read_open_filename.c b/lib/libarchive/archive_read_open_filename.c new file mode 100644 index 000000000..607b80c56 --- /dev/null +++ b/lib/libarchive/archive_read_open_filename.c @@ -0,0 +1,268 @@ +/*- + * Copyright (c) 2003-2007 Tim Kientzle + * All rights reserved. 
+ * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR(S) ``AS IS'' AND ANY EXPRESS OR + * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES + * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. + * IN NO EVENT SHALL THE AUTHOR(S) BE LIABLE FOR ANY DIRECT, INDIRECT, + * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT + * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF + * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + */ + +#include "archive_platform.h" +__FBSDID("$FreeBSD: head/lib/libarchive/archive_read_open_filename.c 201093 2009-12-28 02:28:44Z kientzle $"); + +#ifdef HAVE_SYS_STAT_H +#include <sys/stat.h> +#endif +#ifdef HAVE_ERRNO_H +#include <errno.h> +#endif +#ifdef HAVE_FCNTL_H +#include <fcntl.h> +#endif +#ifdef HAVE_IO_H +#include <io.h> +#endif +#ifdef HAVE_STDLIB_H +#include <stdlib.h> +#endif +#ifdef HAVE_STRING_H +#include <string.h> +#endif +#ifdef HAVE_UNISTD_H +#include <unistd.h> +#endif + +#include "archive.h" + +#ifndef O_BINARY +#define O_BINARY 0 +#endif + +struct read_file_data { + int fd; + size_t block_size; + void *buffer; + mode_t st_mode; /* Mode bits for opened file. */ + char can_skip; /* This file supports skipping. 
*/ + char filename[1]; /* Must be last! 
*/ +}; + +static int file_close(struct archive *, void *); +static ssize_t file_read(struct archive *, void *, const void **buff); +#if ARCHIVE_API_VERSION < 2 +static ssize_t file_skip(struct archive *, void *, size_t request); +#else +static off_t file_skip(struct archive *, void *, off_t request); +#endif + +int +archive_read_open_file(struct archive *a, const char *filename, + size_t block_size) +{ + return (archive_read_open_filename(a, filename, block_size)); +} + +int +archive_read_open_filename(struct archive *a, const char *filename, + size_t block_size) +{ + struct stat st; + struct read_file_data *mine; + void *b; + int fd; + + archive_clear_error(a); + if (filename == NULL || filename[0] == '\0') { + /* We used to invoke archive_read_open_fd(a,0,block_size) + * here, but that doesn't (and shouldn't) handle the + * end-of-file flush when reading stdout from a pipe. + * Basically, read_open_fd() is intended for folks who + * are willing to handle such details themselves. This + * API is intended to be a little smarter for folks who + * want easy handling of the common case. + */ + filename = ""; /* Normalize NULL to "" */ + fd = 0; +#if defined(__CYGWIN__) || defined(_WIN32) + setmode(0, O_BINARY); +#endif + } else { + fd = open(filename, O_RDONLY | O_BINARY); + if (fd < 0) { + archive_set_error(a, errno, + "Failed to open '%s'", filename); + return (ARCHIVE_FATAL); + } + } + if (fstat(fd, &st) != 0) { + archive_set_error(a, errno, "Can't stat '%s'", filename); + return (ARCHIVE_FATAL); + } + + mine = (struct read_file_data *)calloc(1, + sizeof(*mine) + strlen(filename)); + b = malloc(block_size); + if (mine == NULL || b == NULL) { + archive_set_error(a, ENOMEM, "No memory"); + free(mine); + free(b); + return (ARCHIVE_FATAL); + } + strcpy(mine->filename, filename); + mine->block_size = block_size; + mine->buffer = b; + mine->fd = fd; + /* Remember mode so close can decide whether to flush. 
*/ + mine->st_mode = st.st_mode; + /* If we're reading a file from disk, ensure that we don't + overwrite it with an extracted file. */ + if (S_ISREG(st.st_mode)) { + archive_read_extract_set_skip_file(a, st.st_dev, st.st_ino); + /* + * Enabling skip here is a performance optimization + * for anything that supports lseek(). On FreeBSD + * (and probably many other systems), only regular + * files and raw disk devices support lseek() (on + * other input types, lseek() returns success but + * doesn't actually change the file pointer, which + * just completely screws up the position-tracking + * logic). In addition, I've yet to find a portable + * way to determine if a device is a raw disk device. + * So I don't see a way to do much better than to only + * enable this optimization for regular files. + */ + mine->can_skip = 1; + } + return (archive_read_open2(a, mine, + NULL, file_read, file_skip, file_close)); +} + +static ssize_t +file_read(struct archive *a, void *client_data, const void **buff) +{ + struct read_file_data *mine = (struct read_file_data *)client_data; + ssize_t bytes_read; + + *buff = mine->buffer; + bytes_read = read(mine->fd, mine->buffer, mine->block_size); + if (bytes_read < 0) { + if (mine->filename[0] == '\0') + archive_set_error(a, errno, "Error reading stdin"); + else + archive_set_error(a, errno, "Error reading '%s'", + mine->filename); + } + return (bytes_read); +} + +#if ARCHIVE_API_VERSION < 2 +static ssize_t +file_skip(struct archive *a, void *client_data, size_t request) +#else +static off_t +file_skip(struct archive *a, void *client_data, off_t request) +#endif +{ + struct read_file_data *mine = (struct read_file_data *)client_data; + off_t old_offset, new_offset; + + if (!mine->can_skip) /* We can't skip, so ... */ + return (0); /* ... skip zero bytes. 
*/ + + /* Reduce request to the next smallest multiple of block_size */ + request = (request / mine->block_size) * mine->block_size; + if (request == 0) + return (0); + + /* + * Hurray for lazy evaluation: if the first lseek fails, the second + * one will not be executed. + */ + if (((old_offset = lseek(mine->fd, 0, SEEK_CUR)) < 0) || + ((new_offset = lseek(mine->fd, request, SEEK_CUR)) < 0)) + { + /* If skip failed once, it will probably fail again. */ + mine->can_skip = 0; + + if (errno == ESPIPE) + { + /* + * Failure to lseek() can be caused by the file + * descriptor pointing to a pipe, socket or FIFO. + * Return 0 here, so the compression layer will use + * read()s instead to advance the file descriptor. + * It's slower of course, but works as well. + */ + return (0); + } + /* + * There's been an error other than ESPIPE. This is most + * likely caused by a programmer error (too large request) + * or a corrupted archive file. + */ + if (mine->filename[0] == '\0') + /* + * Should never get here, since lseek() on stdin ought + * to return an ESPIPE error. + */ + archive_set_error(a, errno, "Error seeking in stdin"); + else + archive_set_error(a, errno, "Error seeking in '%s'", + mine->filename); + return (-1); + } + return (new_offset - old_offset); +} + +static int +file_close(struct archive *a, void *client_data) +{ + struct read_file_data *mine = (struct read_file_data *)client_data; + + (void)a; /* UNUSED */ + + /* Only flush and close if open succeeded. */ + if (mine->fd >= 0) { + /* + * Sometimes, we should flush the input before closing. + * Regular files: faster to just close without flush. + * Devices: must not flush (user might need to + * read the "next" item on a non-rewind device). + * Pipes and sockets: must flush (otherwise, the + * program feeding the pipe or socket may complain). + * Here, I flush everything except for regular files and + * device nodes. 
+ */ + if (!S_ISREG(mine->st_mode) + && !S_ISCHR(mine->st_mode) + && !S_ISBLK(mine->st_mode)) { + ssize_t bytesRead; + do { + bytesRead = read(mine->fd, mine->buffer, + mine->block_size); + } while (bytesRead > 0); + } + /* If a named file was opened, then it needs to be closed. */ + if (mine->filename[0] != '\0') + close(mine->fd); + } + free(mine->buffer); + free(mine); + return (ARCHIVE_OK); +} diff --git a/lib/libarchive/archive_read_open_memory.c b/lib/libarchive/archive_read_open_memory.c new file mode 100644 index 000000000..61f574fa7 --- /dev/null +++ b/lib/libarchive/archive_read_open_memory.c @@ -0,0 +1,156 @@ +/*- + * Copyright (c) 2003-2007 Tim Kientzle + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR(S) ``AS IS'' AND ANY EXPRESS OR + * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES + * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. + * IN NO EVENT SHALL THE AUTHOR(S) BE LIABLE FOR ANY DIRECT, INDIRECT, + * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT + * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF + * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 
+ */ + +#include "archive_platform.h" +__FBSDID("$FreeBSD: src/lib/libarchive/archive_read_open_memory.c,v 1.6 2007/07/06 15:51:59 kientzle Exp $"); + +#include <errno.h> +#include <stdlib.h> +#include <string.h> + +#include "archive.h" + +/* + * Glue to read an archive from a block of memory. + * + * This is mostly a huge help in building test harnesses; + * test programs can build archives in memory and read them + * back again without having to mess with files on disk. + */ + +struct read_memory_data { + unsigned char *buffer; + unsigned char *end; + ssize_t read_size; +}; + +static int memory_read_close(struct archive *, void *); +static int memory_read_open(struct archive *, void *); +#if ARCHIVE_API_VERSION < 2 +static ssize_t memory_read_skip(struct archive *, void *, size_t request); +#else +static off_t memory_read_skip(struct archive *, void *, off_t request); +#endif +static ssize_t memory_read(struct archive *, void *, const void **buff); + +int +archive_read_open_memory(struct archive *a, void *buff, size_t size) +{ + return archive_read_open_memory2(a, buff, size, size); +} + +/* + * Don't use _open_memory2() in production code; the archive_read_open_memory() + * version is the one you really want. This is just here so that + * test harnesses can exercise block operations inside the library. + */ +int +archive_read_open_memory2(struct archive *a, void *buff, + size_t size, size_t read_size) +{ + struct read_memory_data *mine; + + mine = (struct read_memory_data *)malloc(sizeof(*mine)); + if (mine == NULL) { + archive_set_error(a, ENOMEM, "No memory"); + return (ARCHIVE_FATAL); + } + memset(mine, 0, sizeof(*mine)); + mine->buffer = (unsigned char *)buff; + mine->end = mine->buffer + size; + mine->read_size = read_size; + return (archive_read_open2(a, mine, memory_read_open, + memory_read, memory_read_skip, memory_read_close)); +} + +/* + * There's nothing to open.
+ */ +static int +memory_read_open(struct archive *a, void *client_data) +{ + (void)a; /* UNUSED */ + (void)client_data; /* UNUSED */ + return (ARCHIVE_OK); +} + +/* + * This is scary simple: Just advance a pointer. Limiting + * to read_size is not technically necessary, but it exercises + * more of the internal logic when used with a small block size + * in a test harness. Production use should not specify a block + * size; then this is much faster. + */ +static ssize_t +memory_read(struct archive *a, void *client_data, const void **buff) +{ + struct read_memory_data *mine = (struct read_memory_data *)client_data; + ssize_t size; + + (void)a; /* UNUSED */ + *buff = mine->buffer; + size = mine->end - mine->buffer; + if (size > mine->read_size) + size = mine->read_size; + mine->buffer += size; + return (size); +} + +/* + * Advancing is just as simple. Again, this is doing more than + * necessary in order to better exercise internal code when used + * as a test harness. + */ +#if ARCHIVE_API_VERSION < 2 +static ssize_t +memory_read_skip(struct archive *a, void *client_data, size_t skip) +#else +static off_t +memory_read_skip(struct archive *a, void *client_data, off_t skip) +#endif +{ + struct read_memory_data *mine = (struct read_memory_data *)client_data; + + (void)a; /* UNUSED */ + if ((off_t)skip > (off_t)(mine->end - mine->buffer)) + skip = mine->end - mine->buffer; + /* Round down to block size. */ + skip /= mine->read_size; + skip *= mine->read_size; + mine->buffer += skip; + return (skip); +} + +/* + * Close is just cleaning up our one small bit of data. 
+ */ +static int +memory_read_close(struct archive *a, void *client_data) +{ + struct read_memory_data *mine = (struct read_memory_data *)client_data; + (void)a; /* UNUSED */ + free(mine); + return (ARCHIVE_OK); +} diff --git a/lib/libarchive/archive_read_private.h b/lib/libarchive/archive_read_private.h new file mode 100644 index 000000000..ab7759a0b --- /dev/null +++ b/lib/libarchive/archive_read_private.h @@ -0,0 +1,213 @@ +/*- + * Copyright (c) 2003-2007 Tim Kientzle + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR(S) ``AS IS'' AND ANY EXPRESS OR + * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES + * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. + * IN NO EVENT SHALL THE AUTHOR(S) BE LIABLE FOR ANY DIRECT, INDIRECT, + * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT + * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF + * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + * + * $FreeBSD: head/lib/libarchive/archive_read_private.h 201088 2009-12-28 02:18:55Z kientzle $ + */ + +#ifndef __LIBARCHIVE_BUILD +#error This header is only to be used internally to libarchive. 
+#endif + +#ifndef ARCHIVE_READ_PRIVATE_H_INCLUDED +#define ARCHIVE_READ_PRIVATE_H_INCLUDED + +#include "archive.h" +#include "archive_string.h" +#include "archive_private.h" + +struct archive_read; +struct archive_read_filter_bidder; +struct archive_read_filter; + +/* + * How bidding works for filters: + * * The bid manager reads the first block from the current source. + * * It shows that block to each registered bidder. + * * The bid manager creates a new filter structure for the winning + * bidder and gives the winning bidder a chance to initialize it. + * * The new filter becomes the top filter in the archive_read structure + * and we repeat the process. + * This ends only when no bidder provides a non-zero bid. + */ +struct archive_read_filter_bidder { + /* Configuration data for the bidder. */ + void *data; + /* Taste the upstream filter to see if we handle this. */ + int (*bid)(struct archive_read_filter_bidder *, + struct archive_read_filter *); + /* Initialize a newly-created filter. */ + int (*init)(struct archive_read_filter *); + /* Set an option for the filter bidder. */ + int (*options)(struct archive_read_filter_bidder *, + const char *key, const char *value); + /* Release the bidder's configuration data. */ + int (*free)(struct archive_read_filter_bidder *); +}; + +/* + * This structure is allocated within the archive_read core + * and initialized by archive_read and the init() method of the + * corresponding bidder above. + */ +struct archive_read_filter { + /* Essentially all filters will need these values, so + * just declare them here. */ + struct archive_read_filter_bidder *bidder; /* My bidder. */ + struct archive_read_filter *upstream; /* Who I read from. */ + struct archive_read *archive; /* Associated archive. */ + /* Return next block. */ + ssize_t (*read)(struct archive_read_filter *, const void **); + /* Skip forward this many bytes. 
*/ +#ifndef __minix + int64_t (*skip)(struct archive_read_filter *self, int64_t request); +#else + ssize_t (*skip)(struct archive_read_filter *self, ssize_t request); +#endif + /* Close (just this filter) and free(self). */ + int (*close)(struct archive_read_filter *self); + /* My private data. */ + void *data; + + const char *name; + int code; + + /* Used by reblocking logic. */ + char *buffer; + size_t buffer_size; + char *next; /* Current read location. */ + size_t avail; /* Bytes in my buffer. */ + const void *client_buff; /* Client buffer information. */ + size_t client_total; + const char *client_next; + size_t client_avail; +#ifndef __minix + int64_t position; +#else + off_t position; +#endif + char end_of_file; + char fatal; +}; + +/* + * The client looks a lot like a filter, so we just wrap it here. + * + * TODO: Make archive_read_filter and archive_read_client identical so + * that users of the library can easily register their own + * transformation filters. This will probably break the API/ABI and + * so should be deferred at least until libarchive 3.0. + */ +struct archive_read_client { + archive_read_callback *reader; + archive_skip_callback *skipper; + archive_close_callback *closer; +}; + +struct archive_read { + struct archive archive; + + struct archive_entry *entry; + + /* Dev/ino of the archive being read/written. */ + dev_t skip_file_dev; + ino_t skip_file_ino; + + /* + * Used by archive_read_data() to track blocks and copy + * data to client buffers, filling gaps with zero bytes. + */ + const char *read_data_block; + off_t read_data_offset; + off_t read_data_output_offset; + size_t read_data_remaining; + + /* Callbacks to open/read/write/close client archive stream. */ + struct archive_read_client client; + + /* Registered filter bidders. */ + struct archive_read_filter_bidder bidders[8]; + + /* Last filter in chain */ + struct archive_read_filter *filter; + + /* File offset of beginning of most recently-read header. 
*/ + off_t header_position; + + /* + * Format detection is mostly the same as compression + * detection, with one significant difference: The bidders + * use the read_ahead calls above to examine the stream rather + * than having the supervisor hand them a block of data to + * examine. + */ + + struct archive_format_descriptor { + void *data; + const char *name; + int (*bid)(struct archive_read *); + int (*options)(struct archive_read *, const char *key, + const char *value); + int (*read_header)(struct archive_read *, struct archive_entry *); + int (*read_data)(struct archive_read *, const void **, size_t *, off_t *); + int (*read_data_skip)(struct archive_read *); + int (*cleanup)(struct archive_read *); + } formats[9]; + struct archive_format_descriptor *format; /* Active format. */ + + /* + * Various information needed by archive_extract. + */ + struct extract *extract; + int (*cleanup_archive_extract)(struct archive_read *); +}; + +int __archive_read_register_format(struct archive_read *a, + void *format_data, + const char *name, + int (*bid)(struct archive_read *), + int (*options)(struct archive_read *, const char *, const char *), + int (*read_header)(struct archive_read *, struct archive_entry *), + int (*read_data)(struct archive_read *, const void **, size_t *, off_t *), + int (*read_data_skip)(struct archive_read *), + int (*cleanup)(struct archive_read *)); + +struct archive_read_filter_bidder + *__archive_read_get_bidder(struct archive_read *a); + +const void *__archive_read_ahead(struct archive_read *, size_t, ssize_t *); +const void *__archive_read_filter_ahead(struct archive_read_filter *, + size_t, ssize_t *); +ssize_t __archive_read_consume(struct archive_read *, size_t); +ssize_t __archive_read_filter_consume(struct archive_read_filter *, size_t); +#ifndef __minix +int64_t __archive_read_skip(struct archive_read *, int64_t); +int64_t __archive_read_skip_lenient(struct archive_read *, int64_t); +int64_t __archive_read_filter_skip(struct 
archive_read_filter *, int64_t); +#else +ssize_t __archive_read_skip(struct archive_read *, ssize_t); +ssize_t __archive_read_skip_lenient(struct archive_read *, ssize_t); +ssize_t __archive_read_filter_skip(struct archive_read_filter *, ssize_t); +#endif /* __minix */ +int __archive_read_program(struct archive_read_filter *, const char *); +#endif diff --git a/lib/libarchive/archive_read_support_compression_all.c b/lib/libarchive/archive_read_support_compression_all.c new file mode 100644 index 000000000..38d3c81db --- /dev/null +++ b/lib/libarchive/archive_read_support_compression_all.c @@ -0,0 +1,61 @@ +/*- + * Copyright (c) 2003-2007 Tim Kientzle + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR(S) ``AS IS'' AND ANY EXPRESS OR + * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES + * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. + * IN NO EVENT SHALL THE AUTHOR(S) BE LIABLE FOR ANY DIRECT, INDIRECT, + * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT + * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF + * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 
+ */ + +#include "archive_platform.h" +__FBSDID("$FreeBSD: head/lib/libarchive/archive_read_support_compression_all.c 201248 2009-12-30 06:12:03Z kientzle $"); + +#include "archive.h" + +int +archive_read_support_compression_all(struct archive *a) +{ + /* Bzip falls back to "bunzip2" command-line */ + archive_read_support_compression_bzip2(a); + /* The decompress code doesn't use an outside library. */ + archive_read_support_compression_compress(a); + /* Gzip decompress falls back to "gunzip" command-line. */ + archive_read_support_compression_gzip(a); + /* The LZMA file format has a very weak signature, so it + * may not be feasible to keep this here, but we'll try. + * This will come back out if there are problems. */ + /* Lzma falls back to "unlzma" command-line program. */ + archive_read_support_compression_lzma(a); + /* Xz falls back to "unxz" command-line program. */ + archive_read_support_compression_xz(a); + /* The decode code doesn't use an outside library. */ + archive_read_support_compression_uu(a); + /* The decode code doesn't use an outside library. */ +#ifndef __minix + archive_read_support_compression_rpm(a); +#endif + /* Note: We always return ARCHIVE_OK here, even if some of the + * above return ARCHIVE_WARN. The intent here is to enable + * "as much as possible." Clients who need specific + * compression should enable those individually so they can + * verify the level of support. */ + /* Clear any warning messages set by the above functions. */ + archive_clear_error(a); + return (ARCHIVE_OK); +} diff --git a/lib/libarchive/archive_read_support_compression_bzip2.c b/lib/libarchive/archive_read_support_compression_bzip2.c new file mode 100644 index 000000000..1e45f2202 --- /dev/null +++ b/lib/libarchive/archive_read_support_compression_bzip2.c @@ -0,0 +1,353 @@ +/*- + * Copyright (c) 2003-2007 Tim Kientzle + * All rights reserved. 
+ * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR(S) ``AS IS'' AND ANY EXPRESS OR + * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES + * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. + * IN NO EVENT SHALL THE AUTHOR(S) BE LIABLE FOR ANY DIRECT, INDIRECT, + * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT + * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF + * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + */ + +#include "archive_platform.h" + +__FBSDID("$FreeBSD: head/lib/libarchive/archive_read_support_compression_bzip2.c 201108 2009-12-28 03:28:21Z kientzle $"); + +#ifdef HAVE_ERRNO_H +#include <errno.h> +#endif +#include <stdio.h> +#ifdef HAVE_STDLIB_H +#include <stdlib.h> +#endif +#ifdef HAVE_STRING_H +#include <string.h> +#endif +#ifdef HAVE_UNISTD_H +#include <unistd.h> +#endif +#ifdef HAVE_BZLIB_H +#include <bzlib.h> +#endif + +#include "archive.h" +#include "archive_private.h" +#include "archive_read_private.h" + +#if HAVE_BZLIB_H +struct private_data { + bz_stream stream; + char *out_block; + size_t out_block_size; + char valid; /* True = decompressor is initialized */ + char eof; /* True = found end of compressed data.
*/ +}; + +/* Bzip2 filter */ +static ssize_t bzip2_filter_read(struct archive_read_filter *, const void **); +static int bzip2_filter_close(struct archive_read_filter *); +#endif + +/* + * Note that we can detect bzip2 archives even if we can't decompress + * them. (In fact, we like detecting them because we can give better + * error messages.) So the bid framework here gets compiled even + * if bzlib is unavailable. + */ +static int bzip2_reader_bid(struct archive_read_filter_bidder *, struct archive_read_filter *); +static int bzip2_reader_init(struct archive_read_filter *); +static int bzip2_reader_free(struct archive_read_filter_bidder *); + +int +archive_read_support_compression_bzip2(struct archive *_a) +{ + struct archive_read *a = (struct archive_read *)_a; + struct archive_read_filter_bidder *reader = __archive_read_get_bidder(a); + + if (reader == NULL) + return (ARCHIVE_FATAL); + + reader->data = NULL; + reader->bid = bzip2_reader_bid; + reader->init = bzip2_reader_init; + reader->options = NULL; + reader->free = bzip2_reader_free; +#if HAVE_BZLIB_H + return (ARCHIVE_OK); +#else + archive_set_error(_a, ARCHIVE_ERRNO_MISC, + "Using external bunzip2 program"); + return (ARCHIVE_WARN); +#endif +} + +static int +bzip2_reader_free(struct archive_read_filter_bidder *self){ + (void)self; /* UNUSED */ + return (ARCHIVE_OK); +} + +/* + * Test whether we can handle this data. + * + * This logic returns zero if any part of the signature fails. It + * also tries to Do The Right Thing if a very short buffer prevents us + * from verifying as much as we would like. + */ +static int +bzip2_reader_bid(struct archive_read_filter_bidder *self, struct archive_read_filter *filter) +{ + const unsigned char *buffer; + ssize_t avail; + int bits_checked; + + (void)self; /* UNUSED */ + + /* Minimal bzip2 archive is 14 bytes. 
*/ + buffer = __archive_read_filter_ahead(filter, 14, &avail); + if (buffer == NULL) + return (0); + + /* First three bytes must be "BZh" */ + bits_checked = 0; + if (buffer[0] != 'B' || buffer[1] != 'Z' || buffer[2] != 'h') + return (0); + bits_checked += 24; + + /* Next follows a compression flag which must be an ASCII digit. */ + if (buffer[3] < '1' || buffer[3] > '9') + return (0); + bits_checked += 5; + + /* After BZh[1-9], there must be either a data block + * which begins with 0x314159265359 or an end-of-data + * marker of 0x177245385090. */ + if (memcmp(buffer + 4, "\x31\x41\x59\x26\x53\x59", 6) == 0) + bits_checked += 48; + else if (memcmp(buffer + 4, "\x17\x72\x45\x38\x50\x90", 6) == 0) + bits_checked += 48; + else + return (0); + + return (bits_checked); +} + +#ifndef HAVE_BZLIB_H + +/* + * If we don't have the library on this system, we can't actually do the + * decompression. We can, however, still detect compressed archives + * and emit a useful message. + */ +static int +bzip2_reader_init(struct archive_read_filter *self) +{ + int r; + + r = __archive_read_program(self, "bunzip2"); + /* Note: We set the format here even if __archive_read_program() + * above fails. We do, after all, know what the format is + * even if we weren't able to read it. */ + self->code = ARCHIVE_COMPRESSION_BZIP2; + self->name = "bzip2"; + return (r); +} + + +#else + +/* + * Setup the callbacks. 
+ */ +static int +bzip2_reader_init(struct archive_read_filter *self) +{ + static const size_t out_block_size = 64 * 1024; + void *out_block; + struct private_data *state; + + self->code = ARCHIVE_COMPRESSION_BZIP2; + self->name = "bzip2"; + + state = (struct private_data *)calloc(sizeof(*state), 1); + out_block = (unsigned char *)malloc(out_block_size); + if (self == NULL || state == NULL || out_block == NULL) { + archive_set_error(&self->archive->archive, ENOMEM, + "Can't allocate data for bzip2 decompression"); + free(out_block); + free(state); + return (ARCHIVE_FATAL); + } + + self->data = state; + state->out_block_size = out_block_size; + state->out_block = out_block; + self->read = bzip2_filter_read; + self->skip = NULL; /* not supported */ + self->close = bzip2_filter_close; + + return (ARCHIVE_OK); +} + +/* + * Return the next block of decompressed data. + */ +static ssize_t +bzip2_filter_read(struct archive_read_filter *self, const void **p) +{ + struct private_data *state; + size_t decompressed; + const char *read_buf; + ssize_t ret; + + state = (struct private_data *)self->data; + + if (state->eof) { + *p = NULL; + return (0); + } + + /* Empty our output buffer. */ + state->stream.next_out = state->out_block; + state->stream.avail_out = state->out_block_size; + + /* Try to fill the output buffer. */ + for (;;) { + if (!state->valid) { + if (bzip2_reader_bid(self->bidder, self->upstream) == 0) { + state->eof = 1; + *p = state->out_block; + decompressed = state->stream.next_out + - state->out_block; + return (decompressed); + } + /* Initialize compression library. */ + ret = BZ2_bzDecompressInit(&(state->stream), + 0 /* library verbosity */, + 0 /* don't use low-mem algorithm */); + + /* If init fails, try low-memory algorithm instead. 
*/ + if (ret == BZ_MEM_ERROR) + ret = BZ2_bzDecompressInit(&(state->stream), + 0 /* library verbosity */, + 1 /* do use low-mem algo */); + + if (ret != BZ_OK) { + const char *detail = NULL; + int err = ARCHIVE_ERRNO_MISC; + switch (ret) { + case BZ_PARAM_ERROR: + detail = "invalid setup parameter"; + break; + case BZ_MEM_ERROR: + err = ENOMEM; + detail = "out of memory"; + break; + case BZ_CONFIG_ERROR: + detail = "mis-compiled library"; + break; + } + archive_set_error(&self->archive->archive, err, + "Internal error initializing decompressor%s%s", + detail == NULL ? "" : ": ", + detail == NULL ? "" : detail); + return (ARCHIVE_FATAL); + } + state->valid = 1; + } + + /* stream.next_in is really const, but bzlib + * doesn't declare it so. */ + read_buf = + __archive_read_filter_ahead(self->upstream, 1, &ret); + if (read_buf == NULL) + return (ARCHIVE_FATAL); + state->stream.next_in = (char *)(uintptr_t)read_buf; + state->stream.avail_in = ret; + /* There is no more data, return whatever we have. */ + if (ret == 0) { + state->eof = 1; + *p = state->out_block; + decompressed = state->stream.next_out + - state->out_block; + return (decompressed); + } + + /* Decompress as much as we can in one pass. */ + ret = BZ2_bzDecompress(&(state->stream)); + __archive_read_filter_consume(self->upstream, + state->stream.next_in - read_buf); + + switch (ret) { + case BZ_STREAM_END: /* Found end of stream. */ + switch (BZ2_bzDecompressEnd(&(state->stream))) { + case BZ_OK: + break; + default: + archive_set_error(&(self->archive->archive), + ARCHIVE_ERRNO_MISC, + "Failed to clean up decompressor"); + return (ARCHIVE_FATAL); + } + state->valid = 0; + /* FALLTHROUGH */ + case BZ_OK: /* Decompressor made some progress. */ + /* If we filled our buffer, update stats and return. */ + if (state->stream.avail_out == 0) { + *p = state->out_block; + decompressed = state->stream.next_out + - state->out_block; + return (decompressed); + } + break; + default: /* Return an error.
*/ + archive_set_error(&self->archive->archive, + ARCHIVE_ERRNO_MISC, "bzip decompression failed"); + return (ARCHIVE_FATAL); + } + } +} + +/* + * Clean up the decompressor. + */ +static int +bzip2_filter_close(struct archive_read_filter *self) +{ + struct private_data *state; + int ret = ARCHIVE_OK; + + state = (struct private_data *)self->data; + + if (state->valid) { + switch (BZ2_bzDecompressEnd(&state->stream)) { + case BZ_OK: + break; + default: + archive_set_error(&self->archive->archive, + ARCHIVE_ERRNO_MISC, + "Failed to clean up decompressor"); + ret = ARCHIVE_FATAL; + } + } + + free(state->out_block); + free(state); + return (ret); +} + +#endif /* HAVE_BZLIB_H */ diff --git a/lib/libarchive/archive_read_support_compression_compress.c b/lib/libarchive/archive_read_support_compression_compress.c new file mode 100644 index 000000000..2461975e5 --- /dev/null +++ b/lib/libarchive/archive_read_support_compression_compress.c @@ -0,0 +1,444 @@ +/*- + * Copyright (c) 2003-2007 Tim Kientzle + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR(S) ``AS IS'' AND ANY EXPRESS OR + * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES + * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. 
+ * IN NO EVENT SHALL THE AUTHOR(S) BE LIABLE FOR ANY DIRECT, INDIRECT, + * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT + * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF + * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + */ + +/* + * This code borrows heavily from "compress" source code, which is + * protected by the following copyright. (Clause 3 dropped by request + * of the Regents.) + */ + +/*- + * Copyright (c) 1985, 1986, 1992, 1993 + * The Regents of the University of California. All rights reserved. + * + * This code is derived from software contributed to Berkeley by + * Diomidis Spinellis and James A. Woods, derived from original + * work by Spencer Thomas and Joseph Orost. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * 4. Neither the name of the University nor the names of its contributors + * may be used to endorse or promote products derived from this software + * without specific prior written permission. + * + * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. 
IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. + */ + + +#include "archive_platform.h" +__FBSDID("$FreeBSD: head/lib/libarchive/archive_read_support_compression_compress.c 201094 2009-12-28 02:29:21Z kientzle $"); + +#ifdef HAVE_ERRNO_H +#include <errno.h> +#endif +#ifdef HAVE_STDLIB_H +#include <stdlib.h> +#endif +#ifdef HAVE_STRING_H +#include <string.h> +#endif +#ifdef HAVE_UNISTD_H +#include <unistd.h> +#endif + +#include "archive.h" +#include "archive_private.h" +#include "archive_read_private.h" + +/* + * Because LZW decompression is pretty simple, I've just implemented + * the whole decompressor here (cribbing from "compress" source code, + * of course), rather than relying on an external library. I have + * made an effort to clarify and simplify the algorithm, so the + * names and structure here don't exactly match those used by compress. + */ + +struct private_data { + /* Input variables. */ + const unsigned char *next_in; + size_t avail_in; + int bit_buffer; + int bits_avail; + size_t bytes_in_section; + + /* Output variables. */ + size_t out_block_size; + void *out_block; + + /* Decompression status variables. */ + int use_reset_code; + int end_of_stream; /* EOF status. */ + int maxcode; /* Largest code. */ + int maxcode_bits; /* Length of largest code. */ + int section_end_code; /* When to increase bits. */ + int bits; /* Current code length. */ + int oldcode; /* Previous code. */ + int finbyte; /* Last byte of prev code. */ + + /* Dictionary. */ + int free_ent; /* Next dictionary entry.
*/ + unsigned char suffix[65536]; + uint16_t prefix[65536]; + + /* + * Scratch area for expanding dictionary entries. Note: + * "worst" case here comes from compressing /dev/zero: the + * last code in the dictionary will code a sequence of + * 65536-256 zero bytes. Thus, we need stack space to expand + * a 65280-byte dictionary entry. (Of course, 32640:1 + * compression could also be considered the "best" case. ;-) + */ + unsigned char *stackp; + unsigned char stack[65300]; +}; + +static int compress_bidder_bid(struct archive_read_filter_bidder *, struct archive_read_filter *); +static int compress_bidder_init(struct archive_read_filter *); +static int compress_bidder_free(struct archive_read_filter_bidder *); + +static ssize_t compress_filter_read(struct archive_read_filter *, const void **); +static int compress_filter_close(struct archive_read_filter *); + +static int getbits(struct archive_read_filter *, int n); +static int next_code(struct archive_read_filter *); + +int +archive_read_support_compression_compress(struct archive *_a) +{ + struct archive_read *a = (struct archive_read *)_a; + struct archive_read_filter_bidder *bidder = __archive_read_get_bidder(a); + + if (bidder == NULL) + return (ARCHIVE_FATAL); + + bidder->data = NULL; + bidder->bid = compress_bidder_bid; + bidder->init = compress_bidder_init; + bidder->options = NULL; + bidder->free = compress_bidder_free; + return (ARCHIVE_OK); +} + +/* + * Test whether we can handle this data. + * + * This logic returns zero if any part of the signature fails. It + * also tries to Do The Right Thing if a very short buffer prevents us + * from verifying as much as we would like. 
+ */ +static int +compress_bidder_bid(struct archive_read_filter_bidder *self, + struct archive_read_filter *filter) +{ + const unsigned char *buffer; + ssize_t avail; + int bits_checked; + + (void)self; /* UNUSED */ + + buffer = __archive_read_filter_ahead(filter, 2, &avail); + + if (buffer == NULL) + return (0); + + bits_checked = 0; + if (buffer[0] != 037) /* Verify first ID byte. */ + return (0); + bits_checked += 8; + + if (buffer[1] != 0235) /* Verify second ID byte. */ + return (0); + bits_checked += 8; + + /* + * TODO: Verify more. + */ + + return (bits_checked); +} + +/* + * Setup the callbacks. + */ +static int +compress_bidder_init(struct archive_read_filter *self) +{ + struct private_data *state; + static const size_t out_block_size = 64 * 1024; + void *out_block; + int code; + + self->code = ARCHIVE_COMPRESSION_COMPRESS; + self->name = "compress (.Z)"; + + state = (struct private_data *)calloc(sizeof(*state), 1); + out_block = malloc(out_block_size); + if (state == NULL || out_block == NULL) { + free(out_block); + free(state); + archive_set_error(&self->archive->archive, ENOMEM, + "Can't allocate data for %s decompression", + self->name); + return (ARCHIVE_FATAL); + } + + self->data = state; + state->out_block_size = out_block_size; + state->out_block = out_block; + self->read = compress_filter_read; + self->skip = NULL; /* not supported */ + self->close = compress_filter_close; + + /* XXX MOVE THE FOLLOWING OUT OF INIT() XXX */ + + (void)getbits(self, 8); /* Skip first signature byte. */ + (void)getbits(self, 8); /* Skip second signature byte. */ + + code = getbits(self, 8); + state->maxcode_bits = code & 0x1f; + state->maxcode = (1 << state->maxcode_bits); + state->use_reset_code = code & 0x80; + + /* Initialize decompressor. 
*/ + state->free_ent = 256; + state->stackp = state->stack; + if (state->use_reset_code) + state->free_ent++; + state->bits = 9; + state->section_end_code = (1<<state->bits) - 1; + state->oldcode = -1; + for (code = 255; code >= 0; code--) { + state->prefix[code] = 0; + state->suffix[code] = code; + } + next_code(self); + + return (ARCHIVE_OK); +} + +/* + * Return a block of data from the decompression buffer. Decompress more + * as necessary. + */ +static ssize_t +compress_filter_read(struct archive_read_filter *self, const void **pblock) +{ + struct private_data *state; + unsigned char *p, *start, *end; + int ret; + + state = (struct private_data *)self->data; + if (state->end_of_stream) { + *pblock = NULL; + return (0); + } + p = start = (unsigned char *)state->out_block; + end = start + state->out_block_size; + + while (p < end && !state->end_of_stream) { + if (state->stackp > state->stack) { + *p++ = *--state->stackp; + } else { + ret = next_code(self); + if (ret == -1) + state->end_of_stream = ret; + else if (ret != ARCHIVE_OK) + return (ret); + } + } + + *pblock = start; + return (p - start); +} + +/* + * Clean up the reader. + */ +static int +compress_bidder_free(struct archive_read_filter_bidder *self) +{ + self->data = NULL; + return (ARCHIVE_OK); +} + +/* + * Close and release the filter. + */ +static int +compress_filter_close(struct archive_read_filter *self) +{ + struct private_data *state = (struct private_data *)self->data; + + free(state->out_block); + free(state); + return (ARCHIVE_OK); +} + +/* + * Process the next code and fill the stack with the expansion + * of the code. Returns ARCHIVE_FATAL if there is a fatal I/O or + * format error, ARCHIVE_EOF if we hit end of data, ARCHIVE_OK otherwise.
+ */ +static int +next_code(struct archive_read_filter *self) +{ + struct private_data *state = (struct private_data *)self->data; + int code, newcode; + + static int debug_buff[1024]; + static unsigned debug_index; + + code = newcode = getbits(self, state->bits); + if (code < 0) + return (code); + + debug_buff[debug_index++] = code; + if (debug_index >= sizeof(debug_buff)/sizeof(debug_buff[0])) + debug_index = 0; + + /* If it's a reset code, reset the dictionary. */ + if ((code == 256) && state->use_reset_code) { + /* + * The original 'compress' implementation blocked its + * I/O in a manner that resulted in junk bytes being + * inserted after every reset. The next section skips + * this junk. (Yes, the number of *bytes* to skip is + * a function of the current *bit* length.) + */ + int skip_bytes = state->bits - + (state->bytes_in_section % state->bits); + skip_bytes %= state->bits; + state->bits_avail = 0; /* Discard rest of this byte. */ + while (skip_bytes-- > 0) { + code = getbits(self, 8); + if (code < 0) + return (code); + } + /* Now, actually do the reset. */ + state->bytes_in_section = 0; + state->bits = 9; + state->section_end_code = (1 << state->bits) - 1; + state->free_ent = 257; + state->oldcode = -1; + return (next_code(self)); + } + + if (code > state->free_ent) { + /* An invalid code is a fatal error. */ + archive_set_error(&(self->archive->archive), -1, + "Invalid compressed data"); + return (ARCHIVE_FATAL); + } + + /* Special case for KwKwK string. */ + if (code >= state->free_ent) { + *state->stackp++ = state->finbyte; + code = state->oldcode; + } + + /* Generate output characters in reverse order. */ + while (code >= 256) { + *state->stackp++ = state->suffix[code]; + code = state->prefix[code]; + } + *state->stackp++ = state->finbyte = code; + + /* Generate the new entry. 
*/ + code = state->free_ent; + if (code < state->maxcode && state->oldcode >= 0) { + state->prefix[code] = state->oldcode; + state->suffix[code] = state->finbyte; + ++state->free_ent; + } + if (state->free_ent > state->section_end_code) { + state->bits++; + state->bytes_in_section = 0; + if (state->bits == state->maxcode_bits) + state->section_end_code = state->maxcode; + else + state->section_end_code = (1 << state->bits) - 1; + } + + /* Remember previous code. */ + state->oldcode = newcode; + return (ARCHIVE_OK); +} + +/* + * Return next 'n' bits from stream. + * + * -1 indicates end of available data. + */ +static int +getbits(struct archive_read_filter *self, int n) +{ + struct private_data *state = (struct private_data *)self->data; + int code; + ssize_t ret; + static const int mask[] = { + 0x00, 0x01, 0x03, 0x07, 0x0f, 0x1f, 0x3f, 0x7f, 0xff, + 0x1ff, 0x3ff, 0x7ff, 0xfff, 0x1fff, 0x3fff, 0x7fff, 0xffff + }; + + while (state->bits_avail < n) { + if (state->avail_in <= 0) { + state->next_in + = __archive_read_filter_ahead(self->upstream, + 1, &ret); + if (ret == 0) + return (-1); + if (ret < 0 || state->next_in == NULL) + return (ARCHIVE_FATAL); + state->avail_in = ret; + __archive_read_filter_consume(self->upstream, ret); + } + state->bit_buffer |= *state->next_in++ << state->bits_avail; + state->avail_in--; + state->bits_avail += 8; + state->bytes_in_section++; + } + + code = state->bit_buffer; + state->bit_buffer >>= n; + state->bits_avail -= n; + + return (code & mask[n]); +} diff --git a/lib/libarchive/archive_read_support_compression_gzip.c b/lib/libarchive/archive_read_support_compression_gzip.c new file mode 100644 index 000000000..c8af1eef6 --- /dev/null +++ b/lib/libarchive/archive_read_support_compression_gzip.c @@ -0,0 +1,469 @@ +/*- + * Copyright (c) 2003-2007 Tim Kientzle + * All rights reserved. 
+ * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR(S) ``AS IS'' AND ANY EXPRESS OR + * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES + * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. + * IN NO EVENT SHALL THE AUTHOR(S) BE LIABLE FOR ANY DIRECT, INDIRECT, + * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT + * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF + * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 
+ */ + +#include "archive_platform.h" + +__FBSDID("$FreeBSD: head/lib/libarchive/archive_read_support_compression_gzip.c 201082 2009-12-28 02:05:28Z kientzle $"); + + +#ifdef HAVE_ERRNO_H +#include <errno.h> +#endif +#ifdef HAVE_STDLIB_H +#include <stdlib.h> +#endif +#ifdef HAVE_STRING_H +#include <string.h> +#endif +#ifdef HAVE_UNISTD_H +#include <unistd.h> +#endif +#ifdef HAVE_ZLIB_H +#include <zlib.h> +#endif + +#include "archive.h" +#include "archive_private.h" +#include "archive_read_private.h" + +#ifdef HAVE_ZLIB_H +struct private_data { + z_stream stream; + char in_stream; + unsigned char *out_block; + size_t out_block_size; +#ifndef __minix + int64_t total_out; +#else + int32_t total_out; +#endif + unsigned long crc; + char eof; /* True = found end of compressed data. */ +}; + +/* Gzip Filter. */ +static ssize_t gzip_filter_read(struct archive_read_filter *, const void **); +static int gzip_filter_close(struct archive_read_filter *); +#endif + +/* + * Note that we can detect gzip archives even if we can't decompress + * them. (In fact, we like detecting them because we can give better + * error messages.) So the bid framework here gets compiled even + * if zlib is unavailable. + * + * TODO: If zlib is unavailable, gzip_bidder_init() should + * use the compress_program framework to try to fire up an external + * gunzip program. + */ +static int gzip_bidder_bid(struct archive_read_filter_bidder *, + struct archive_read_filter *); +static int gzip_bidder_init(struct archive_read_filter *); + +int +archive_read_support_compression_gzip(struct archive *_a) +{ + struct archive_read *a = (struct archive_read *)_a; + struct archive_read_filter_bidder *bidder = __archive_read_get_bidder(a); + + if (bidder == NULL) + return (ARCHIVE_FATAL); + + bidder->data = NULL; + bidder->bid = gzip_bidder_bid; + bidder->init = gzip_bidder_init; + bidder->options = NULL; + bidder->free = NULL; /* No data, so no cleanup necessary. */ + /* Signal the extent of gzip support with the return value here.
*/ +#if HAVE_ZLIB_H + return (ARCHIVE_OK); +#else + archive_set_error(_a, ARCHIVE_ERRNO_MISC, + "Using external gunzip program"); + return (ARCHIVE_WARN); +#endif +} + +/* + * Read and verify the header. + * + * Returns zero if the header couldn't be validated, else returns + * number of bytes in header. If pbits is non-NULL, it receives a + * count of bits verified, suitable for use by bidder. + */ +static int +peek_at_header(struct archive_read_filter *filter, int *pbits) +{ + const unsigned char *p; + ssize_t avail, len; + int bits = 0; + int header_flags; + + /* Start by looking at the first ten bytes of the header, which + * is all fixed layout. */ + len = 10; + p = __archive_read_filter_ahead(filter, len, &avail); + if (p == NULL || avail == 0) + return (0); + if (p[0] != 037) + return (0); + bits += 8; + if (p[1] != 0213) + return (0); + bits += 8; + if (p[2] != 8) /* We only support deflation. */ + return (0); + bits += 8; + if ((p[3] & 0xE0)!= 0) /* No reserved flags set. */ + return (0); + bits += 3; + header_flags = p[3]; + /* Bytes 4-7 are mod time. */ + /* Byte 8 is deflate flags. */ + /* XXXX TODO: return deflate flags back to consume_header for use + in initializing the decompressor. */ + /* Byte 9 is OS. */ + + /* Optional extra data: 2 byte length plus variable body. */ + if (header_flags & 4) { + p = __archive_read_filter_ahead(filter, len + 2, &avail); + if (p == NULL) + return (0); + len += ((int)p[len + 1] << 8) | (int)p[len]; + len += 2; + } + + /* Null-terminated optional filename. */ + if (header_flags & 8) { + do { + ++len; + if (avail < len) + p = __archive_read_filter_ahead(filter, + len, &avail); + if (p == NULL) + return (0); + } while (p[len - 1] != 0); + } + + /* Null-terminated optional comment. 
*/ + if (header_flags & 16) { + do { + ++len; + if (avail < len) + p = __archive_read_filter_ahead(filter, + len, &avail); + if (p == NULL) + return (0); + } while (p[len - 1] != 0); + } + + /* Optional header CRC */ + if ((header_flags & 2)) { + p = __archive_read_filter_ahead(filter, len + 2, &avail); + if (p == NULL) + return (0); +#if 0 + int hcrc = ((int)p[len + 1] << 8) | (int)p[len]; + int crc = /* XXX TODO: Compute header CRC. */; + if (crc != hcrc) + return (0); + bits += 16; +#endif + len += 2; + } + + if (pbits != NULL) + *pbits = bits; + return (len); +} + +/* + * Bidder just verifies the header and returns the number of verified bits. + */ +static int +gzip_bidder_bid(struct archive_read_filter_bidder *self, + struct archive_read_filter *filter) +{ + int bits_checked; + + (void)self; /* UNUSED */ + + if (peek_at_header(filter, &bits_checked)) + return (bits_checked); + return (0); +} + + +#ifndef HAVE_ZLIB_H + +/* + * If we don't have the library on this system, we can't do the + * decompression directly. We can, however, try to run gunzip + * in case that's available. + */ +static int +gzip_bidder_init(struct archive_read_filter *self) +{ + int r; + + r = __archive_read_program(self, "gunzip"); + /* Note: We set the format here even if __archive_read_program() + * above fails. We do, after all, know what the format is + * even if we weren't able to read it. */ + self->code = ARCHIVE_COMPRESSION_GZIP; + self->name = "gzip"; + return (r); +} + +#else + +/* + * Initialize the filter object. 
+ */ +static int +gzip_bidder_init(struct archive_read_filter *self) +{ + struct private_data *state; + static const size_t out_block_size = 64 * 1024; + void *out_block; + + self->code = ARCHIVE_COMPRESSION_GZIP; + self->name = "gzip"; + + state = (struct private_data *)calloc(sizeof(*state), 1); + out_block = (unsigned char *)malloc(out_block_size); + if (state == NULL || out_block == NULL) { + free(out_block); + free(state); + archive_set_error(&self->archive->archive, ENOMEM, + "Can't allocate data for gzip decompression"); + return (ARCHIVE_FATAL); + } + + self->data = state; + state->out_block_size = out_block_size; + state->out_block = out_block; + self->read = gzip_filter_read; + self->skip = NULL; /* not supported */ + self->close = gzip_filter_close; + + state->in_stream = 0; /* We're not actually within a stream yet. */ + + return (ARCHIVE_OK); +} + +static int +consume_header(struct archive_read_filter *self) +{ + struct private_data *state; + ssize_t avail; + size_t len; + int ret; + + state = (struct private_data *)self->data; + + /* If this is a real header, consume it. */ + len = peek_at_header(self->upstream, NULL); + if (len == 0) + return (ARCHIVE_EOF); + __archive_read_filter_consume(self->upstream, len); + + /* Initialize CRC accumulator. */ + state->crc = crc32(0L, NULL, 0); + + /* Initialize compression library. */ + state->stream.next_in = (unsigned char *)(uintptr_t) + __archive_read_filter_ahead(self->upstream, 1, &avail); + state->stream.avail_in = avail; + ret = inflateInit2(&(state->stream), + -15 /* Don't check for zlib header */); + + /* Decipher the error code. 
*/ + switch (ret) { + case Z_OK: + state->in_stream = 1; + return (ARCHIVE_OK); + case Z_STREAM_ERROR: + archive_set_error(&self->archive->archive, + ARCHIVE_ERRNO_MISC, + "Internal error initializing compression library: " + "invalid setup parameter"); + break; + case Z_MEM_ERROR: + archive_set_error(&self->archive->archive, ENOMEM, + "Internal error initializing compression library: " + "out of memory"); + break; + case Z_VERSION_ERROR: + archive_set_error(&self->archive->archive, + ARCHIVE_ERRNO_MISC, + "Internal error initializing compression library: " + "invalid library version"); + break; + default: + archive_set_error(&self->archive->archive, + ARCHIVE_ERRNO_MISC, + "Internal error initializing compression library: " + " Zlib error %d", ret); + break; + } + return (ARCHIVE_FATAL); +} + +static int +consume_trailer(struct archive_read_filter *self) +{ + struct private_data *state; + const unsigned char *p; + ssize_t avail; + + state = (struct private_data *)self->data; + + state->in_stream = 0; + switch (inflateEnd(&(state->stream))) { + case Z_OK: + break; + default: + archive_set_error(&self->archive->archive, + ARCHIVE_ERRNO_MISC, + "Failed to clean up gzip decompressor"); + return (ARCHIVE_FATAL); + } + + /* GZip trailer is a fixed 8 byte structure. */ + p = __archive_read_filter_ahead(self->upstream, 8, &avail); + if (p == NULL || avail == 0) + return (ARCHIVE_FATAL); + + /* XXX TODO: Verify the length and CRC. */ + + /* We've verified the trailer, so consume it now. */ + __archive_read_filter_consume(self->upstream, 8); + + return (ARCHIVE_OK); +} + +static ssize_t +gzip_filter_read(struct archive_read_filter *self, const void **p) +{ + struct private_data *state; + size_t decompressed; + ssize_t avail_in; + int ret; + + state = (struct private_data *)self->data; + + /* Empty our output buffer. */ + state->stream.next_out = state->out_block; + state->stream.avail_out = state->out_block_size; + + /* Try to fill the output buffer. 
*/ + while (state->stream.avail_out > 0 && !state->eof) { + /* If we're not in a stream, read a header + * and initialize the decompression library. */ + if (!state->in_stream) { + ret = consume_header(self); + if (ret == ARCHIVE_EOF) { + state->eof = 1; + break; + } + if (ret < ARCHIVE_OK) + return (ret); + } + + /* Peek at the next available data. */ + /* ZLib treats stream.next_in as const but doesn't declare + * it so, hence this ugly cast. */ + state->stream.next_in = (unsigned char *)(uintptr_t) + __archive_read_filter_ahead(self->upstream, 1, &avail_in); + if (state->stream.next_in == NULL) + return (ARCHIVE_FATAL); + state->stream.avail_in = avail_in; + + /* Decompress and consume some of that data. */ + ret = inflate(&(state->stream), 0); + switch (ret) { + case Z_OK: /* Decompressor made some progress. */ + __archive_read_filter_consume(self->upstream, + avail_in - state->stream.avail_in); + break; + case Z_STREAM_END: /* Found end of stream. */ + __archive_read_filter_consume(self->upstream, + avail_in - state->stream.avail_in); + /* Consume the stream trailer; release the + * decompression library. */ + ret = consume_trailer(self); + if (ret < ARCHIVE_OK) + return (ret); + break; + default: + /* Return an error. */ + archive_set_error(&self->archive->archive, + ARCHIVE_ERRNO_MISC, + "gzip decompression failed"); + return (ARCHIVE_FATAL); + } + } + + /* We've read as much as we can. */ + decompressed = state->stream.next_out - state->out_block; + state->total_out += decompressed; + if (decompressed == 0) + *p = NULL; + else + *p = state->out_block; + return (decompressed); +} + +/* + * Clean up the decompressor. 
+ */ +static int +gzip_filter_close(struct archive_read_filter *self) +{ + struct private_data *state; + int ret; + + state = (struct private_data *)self->data; + ret = ARCHIVE_OK; + + if (state->in_stream) { + switch (inflateEnd(&(state->stream))) { + case Z_OK: + break; + default: + archive_set_error(&(self->archive->archive), + ARCHIVE_ERRNO_MISC, + "Failed to clean up gzip compressor"); + ret = ARCHIVE_FATAL; + } + } + + free(state->out_block); + free(state); + return (ret); +} + +#endif /* HAVE_ZLIB_H */ diff --git a/lib/libarchive/archive_read_support_compression_none.c b/lib/libarchive/archive_read_support_compression_none.c new file mode 100644 index 000000000..955d06d9a --- /dev/null +++ b/lib/libarchive/archive_read_support_compression_none.c @@ -0,0 +1,40 @@ +/*- + * Copyright (c) 2003-2007 Tim Kientzle + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR(S) ``AS IS'' AND ANY EXPRESS OR + * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES + * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. 
+ * IN NO EVENT SHALL THE AUTHOR(S) BE LIABLE FOR ANY DIRECT, INDIRECT, + * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT + * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF + * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + */ + +#include "archive_platform.h" +__FBSDID("$FreeBSD: head/lib/libarchive/archive_read_support_compression_none.c 185679 2008-12-06 06:45:15Z kientzle $"); + +#include "archive.h" + +/* + * Uncompressed streams are handled implicitly by the read core, + * so this is now a no-op. + */ +int +archive_read_support_compression_none(struct archive *a) +{ + (void)a; /* UNUSED */ + return (ARCHIVE_OK); +} diff --git a/lib/libarchive/archive_read_support_compression_program.c b/lib/libarchive/archive_read_support_compression_program.c new file mode 100644 index 000000000..0c63f2e83 --- /dev/null +++ b/lib/libarchive/archive_read_support_compression_program.c @@ -0,0 +1,459 @@ +/*- + * Copyright (c) 2007 Joerg Sonnenberger + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. 
+ * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR(S) ``AS IS'' AND ANY EXPRESS OR + * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES + * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. + * IN NO EVENT SHALL THE AUTHOR(S) BE LIABLE FOR ANY DIRECT, INDIRECT, + * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT + * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF + * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + */ + +#include "archive_platform.h" +__FBSDID("$FreeBSD: head/lib/libarchive/archive_read_support_compression_program.c 201112 2009-12-28 06:59:35Z kientzle $"); + +#ifdef HAVE_SYS_WAIT_H +# include <sys/wait.h> +#endif +#ifdef HAVE_ERRNO_H +# include <errno.h> +#endif +#ifdef HAVE_FCNTL_H +# include <fcntl.h> +#endif +#ifdef HAVE_LIMITS_H +# include <limits.h> +#endif +#ifdef HAVE_SIGNAL_H +# include <signal.h> +#endif +#ifdef HAVE_STDLIB_H +# include <stdlib.h> +#endif +#ifdef HAVE_STRING_H +# include <string.h> +#endif +#ifdef HAVE_UNISTD_H +# include <unistd.h> +#endif + +#include "archive.h" +#include "archive_private.h" +#include "archive_read_private.h" + +int +archive_read_support_compression_program(struct archive *a, const char *cmd) +{ + return (archive_read_support_compression_program_signature(a, cmd, NULL, 0)); +} + + +/* This capability is only available on POSIX systems. */ +#if (!defined(HAVE_PIPE) || !defined(HAVE_FCNTL) || \ + !(defined(HAVE_FORK) || defined(HAVE_VFORK))) && (!defined(_WIN32) || defined(__CYGWIN__)) + +/* + * On non-Posix systems, allow the program to build, but choke if + * this function is actually invoked.
+ */ +int +archive_read_support_compression_program_signature(struct archive *_a, + const char *cmd, void *signature, size_t signature_len) +{ + (void)_a; /* UNUSED */ + (void)cmd; /* UNUSED */ + (void)signature; /* UNUSED */ + (void)signature_len; /* UNUSED */ + + archive_set_error(_a, -1, + "External compression programs not supported on this platform"); + return (ARCHIVE_FATAL); +} + +int +__archive_read_program(struct archive_read_filter *self, const char *cmd) +{ + (void)self; /* UNUSED */ + (void)cmd; /* UNUSED */ + + archive_set_error(&self->archive->archive, -1, + "External compression programs not supported on this platform"); + return (ARCHIVE_FATAL); +} + +#else + +#include "filter_fork.h" + +/* + * The bidder object stores the command and the signature to watch for. + * The 'inhibit' entry here is used to ensure that unchecked filters never + * bid twice in the same pipeline. + */ +struct program_bidder { + char *cmd; + void *signature; + size_t signature_len; + int inhibit; +}; + +static int program_bidder_bid(struct archive_read_filter_bidder *, + struct archive_read_filter *upstream); +static int program_bidder_init(struct archive_read_filter *); +static int program_bidder_free(struct archive_read_filter_bidder *); + +/* + * The actual filter needs to track input and output data. + */ +struct program_filter { + char *description; + pid_t child; + int exit_status; + int waitpid_return; + int child_stdin, child_stdout; + + char *out_buf; + size_t out_buf_len; +}; + +static ssize_t program_filter_read(struct archive_read_filter *, + const void **); +static int program_filter_close(struct archive_read_filter *); + +int +archive_read_support_compression_program_signature(struct archive *_a, + const char *cmd, const void *signature, size_t signature_len) +{ + struct archive_read *a = (struct archive_read *)_a; + struct archive_read_filter_bidder *bidder; + struct program_bidder *state; + + /* + * Get a bidder object from the read core. 
+ */ + bidder = __archive_read_get_bidder(a); + if (bidder == NULL) + return (ARCHIVE_FATAL); + + /* + * Allocate our private state. + */ + state = (struct program_bidder *)calloc(sizeof (*state), 1); + if (state == NULL) + return (ARCHIVE_FATAL); + state->cmd = strdup(cmd); + if (signature != NULL && signature_len > 0) { + state->signature_len = signature_len; + state->signature = malloc(signature_len); + memcpy(state->signature, signature, signature_len); + } + + /* + * Fill in the bidder object. + */ + bidder->data = state; + bidder->bid = program_bidder_bid; + bidder->init = program_bidder_init; + bidder->options = NULL; + bidder->free = program_bidder_free; + return (ARCHIVE_OK); +} + +static int +program_bidder_free(struct archive_read_filter_bidder *self) +{ + struct program_bidder *state = (struct program_bidder *)self->data; + free(state->cmd); + free(state->signature); + free(self->data); + return (ARCHIVE_OK); +} + +/* + * If we do have a signature, bid only if that matches. + * + * If there's no signature, we bid INT_MAX the first time + * we're called, then never bid again. + */ +static int +program_bidder_bid(struct archive_read_filter_bidder *self, + struct archive_read_filter *upstream) +{ + struct program_bidder *state = self->data; + const char *p; + + /* If we have a signature, use that to match. */ + if (state->signature_len > 0) { + p = __archive_read_filter_ahead(upstream, + state->signature_len, NULL); + if (p == NULL) + return (0); + /* No match, so don't bid. */ + if (memcmp(p, state->signature, state->signature_len) != 0) + return (0); + return ((int)state->signature_len * 8); + } + + /* Otherwise, bid once and then never bid again. */ + if (state->inhibit) + return (0); + state->inhibit = 1; + return (INT_MAX); +} + +/* + * Shut down the child, return ARCHIVE_OK if it exited normally. 
+ * + * Note that the return value is sticky; if we're called again, + * we won't reap the child again, but we will return the same status + * (including error message if the child came to a bad end). + */ +static int +child_stop(struct archive_read_filter *self, struct program_filter *state) +{ + /* Close our side of the I/O with the child. */ + if (state->child_stdin != -1) { + close(state->child_stdin); + state->child_stdin = -1; + } + if (state->child_stdout != -1) { + close(state->child_stdout); + state->child_stdout = -1; + } + + if (state->child != 0) { + /* Reap the child. */ + do { + state->waitpid_return + = waitpid(state->child, &state->exit_status, 0); + } while (state->waitpid_return == -1 && errno == EINTR); + state->child = 0; + } + + if (state->waitpid_return < 0) { + /* waitpid() failed? This is ugly. */ + archive_set_error(&self->archive->archive, ARCHIVE_ERRNO_MISC, + "Child process exited badly"); + return (ARCHIVE_WARN); + } + +#if !defined(_WIN32) || defined(__CYGWIN__) + if (WIFSIGNALED(state->exit_status)) { +#ifdef SIGPIPE + /* If the child died because we stopped reading before + * it was done, that's okay. Some archive formats + * have padding at the end that we routinely ignore. */ + /* The alternative to this would be to add a step + * before close(child_stdout) above to read from the + * child until the child has no more to write. 
*/ + if (WTERMSIG(state->exit_status) == SIGPIPE) + return (ARCHIVE_OK); +#endif + archive_set_error(&self->archive->archive, ARCHIVE_ERRNO_MISC, + "Child process exited with signal %d", + WTERMSIG(state->exit_status)); + return (ARCHIVE_WARN); + } +#endif /* !_WIN32 || __CYGWIN__ */ + + if (WIFEXITED(state->exit_status)) { + if (WEXITSTATUS(state->exit_status) == 0) + return (ARCHIVE_OK); + + archive_set_error(&self->archive->archive, + ARCHIVE_ERRNO_MISC, + "Child process exited with status %d", + WEXITSTATUS(state->exit_status)); + return (ARCHIVE_WARN); + } + + return (ARCHIVE_WARN); +} + +/* + * Use select() to decide whether the child is ready for read or write. + */ +static ssize_t +child_read(struct archive_read_filter *self, char *buf, size_t buf_len) +{ + struct program_filter *state = self->data; + ssize_t ret, requested, avail; + const char *p; + + requested = buf_len > SSIZE_MAX ? SSIZE_MAX : buf_len; + + for (;;) { + do { + ret = read(state->child_stdout, buf, requested); + } while (ret == -1 && errno == EINTR); + + if (ret > 0) + return (ret); + if (ret == 0 || (ret == -1 && errno == EPIPE)) + /* Child has closed its output; reap the child + * and return the status. */ + return (child_stop(self, state)); + if (ret == -1 && errno != EAGAIN) + return (-1); + + if (state->child_stdin == -1) { + /* Block until child has some I/O ready. */ + __archive_check_child(state->child_stdin, + state->child_stdout); + continue; + } + + /* Get some more data from upstream. */ + p = __archive_read_filter_ahead(self->upstream, 1, &avail); + if (p == NULL) { + close(state->child_stdin); + state->child_stdin = -1; + fcntl(state->child_stdout, F_SETFL, 0); + if (avail < 0) + return (avail); + continue; + } + + do { + ret = write(state->child_stdin, p, avail); + } while (ret == -1 && errno == EINTR); + + if (ret > 0) { + /* Consume whatever we managed to write. 
*/ + __archive_read_filter_consume(self->upstream, ret); + } else if (ret == -1 && errno == EAGAIN) { + /* Block until child has some I/O ready. */ + __archive_check_child(state->child_stdin, + state->child_stdout); + } else { + /* Write failed. */ + close(state->child_stdin); + state->child_stdin = -1; + fcntl(state->child_stdout, F_SETFL, 0); + /* If it was a bad error, we're done; otherwise + * it was EPIPE or EOF, and we can still read + * from the child. */ + if (ret == -1 && errno != EPIPE) + return (-1); + } + } +} + +int +__archive_read_program(struct archive_read_filter *self, const char *cmd) +{ + struct program_filter *state; + static const size_t out_buf_len = 65536; + char *out_buf; + char *description; + const char *prefix = "Program: "; + + state = (struct program_filter *)calloc(1, sizeof(*state)); + out_buf = (char *)malloc(out_buf_len); + description = (char *)malloc(strlen(prefix) + strlen(cmd) + 1); + if (state == NULL || out_buf == NULL || description == NULL) { + archive_set_error(&self->archive->archive, ENOMEM, + "Can't allocate input data"); + free(state); + free(out_buf); + free(description); + return (ARCHIVE_FATAL); + } + + self->code = ARCHIVE_COMPRESSION_PROGRAM; + state->description = description; + strcpy(state->description, prefix); + strcat(state->description, cmd); + self->name = state->description; + + state->out_buf = out_buf; + state->out_buf_len = out_buf_len; + + if ((state->child = __archive_create_child(cmd, + &state->child_stdin, &state->child_stdout)) == -1) { + free(state->out_buf); + free(state); + archive_set_error(&self->archive->archive, EINVAL, + "Can't initialise filter"); + return (ARCHIVE_FATAL); + } + + self->data = state; + self->read = program_filter_read; + self->skip = NULL; + self->close = program_filter_close; + + /* XXX Check that we can read at least one byte? 
*/ + return (ARCHIVE_OK); +} + +static int +program_bidder_init(struct archive_read_filter *self) +{ + struct program_bidder *bidder_state; + + bidder_state = (struct program_bidder *)self->bidder->data; + return (__archive_read_program(self, bidder_state->cmd)); +} + +static ssize_t +program_filter_read(struct archive_read_filter *self, const void **buff) +{ + struct program_filter *state; + ssize_t bytes; + size_t total; + char *p; + + state = (struct program_filter *)self->data; + + total = 0; + p = state->out_buf; + while (state->child_stdout != -1 && total < state->out_buf_len) { + bytes = child_read(self, p, state->out_buf_len - total); + if (bytes < 0) + /* No recovery is possible if we can no longer + * read from the child. */ + return (ARCHIVE_FATAL); + if (bytes == 0) + /* We got EOF from the child. */ + break; + total += bytes; + p += bytes; + } + + *buff = state->out_buf; + return (total); +} + +static int +program_filter_close(struct archive_read_filter *self) +{ + struct program_filter *state; + int e; + + state = (struct program_filter *)self->data; + e = child_stop(self, state); + + /* Release our private data. */ + free(state->out_buf); + free(state->description); + free(state); + + return (e); +} + +#endif /* !defined(HAVE_PIPE) || !defined(HAVE_VFORK) || !defined(HAVE_FCNTL) */ diff --git a/lib/libarchive/archive_read_support_compression_uu.c b/lib/libarchive/archive_read_support_compression_uu.c new file mode 100644 index 000000000..df6390a9b --- /dev/null +++ b/lib/libarchive/archive_read_support_compression_uu.c @@ -0,0 +1,631 @@ +/*- + * Copyright (c) 2009 Michihiro NAKAJIMA + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. 
Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR(S) ``AS IS'' AND ANY EXPRESS OR + * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES + * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. + * IN NO EVENT SHALL THE AUTHOR(S) BE LIABLE FOR ANY DIRECT, INDIRECT, + * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT + * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF + * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + */ + +#include "archive_platform.h" +__FBSDID("$FreeBSD: head/lib/libarchive/archive_read_support_compression_uu.c 201248 2009-12-30 06:12:03Z kientzle $"); + +#ifdef HAVE_ERRNO_H +#include <errno.h> +#endif +#ifdef HAVE_STDLIB_H +#include <stdlib.h> +#endif +#ifdef HAVE_STRING_H +#include <string.h> +#endif + +#include "archive.h" +#include "archive_private.h" +#include "archive_read_private.h" + +struct uudecode { +#ifndef __minix + int64_t total; +#else + int32_t total; +#endif + unsigned char *in_buff; +#define IN_BUFF_SIZE (1024) + int in_cnt; + size_t in_allocated; + unsigned char *out_buff; +#define OUT_BUFF_SIZE (64 * 1024) + int state; +#define ST_FIND_HEAD 0 +#define ST_READ_UU 1 +#define ST_UUEND 2 +#define ST_READ_BASE64 3 +}; + +static int uudecode_bidder_bid(struct archive_read_filter_bidder *, + struct archive_read_filter *filter); +static int uudecode_bidder_init(struct archive_read_filter *); + +static ssize_t uudecode_filter_read(struct archive_read_filter *, + const void **); +static int uudecode_filter_close(struct 
archive_read_filter *); + +int +archive_read_support_compression_uu(struct archive *_a) +{ + struct archive_read *a = (struct archive_read *)_a; + struct archive_read_filter_bidder *bidder; + + bidder = __archive_read_get_bidder(a); + archive_clear_error(_a); + if (bidder == NULL) + return (ARCHIVE_FATAL); + + bidder->data = NULL; + bidder->bid = uudecode_bidder_bid; + bidder->init = uudecode_bidder_init; + bidder->options = NULL; + bidder->free = NULL; + return (ARCHIVE_OK); +} + +static const unsigned char ascii[256] = { + 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, '\n', 0, 0, '\r', 0, 0, /* 00 - 0F */ + 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, /* 10 - 1F */ + 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, /* 20 - 2F */ + 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, /* 30 - 3F */ + 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, /* 40 - 4F */ + 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, /* 50 - 5F */ + 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, /* 60 - 6F */ + 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, /* 70 - 7F */ + 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, /* 80 - 8F */ + 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, /* 90 - 9F */ + 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, /* A0 - AF */ + 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, /* B0 - BF */ + 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, /* C0 - CF */ + 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, /* D0 - DF */ + 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, /* E0 - EF */ + 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, /* F0 - FF */ +}; + +static const unsigned char uuchar[256] = { + 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, /* 00 - 0F */ + 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, /* 10 - 1F */ + 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, /* 20 - 2F */ + 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, /* 30 - 3F */ + 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, /* 40 - 4F */ + 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, /* 50 - 5F */ + 1, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, /* 60 - 6F */ + 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, /* 70 - 7F */ + 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, /* 80 - 8F */ + 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, /* 90 - 9F */ + 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, /* A0 - AF */ + 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, /* B0 - BF */ + 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, /* C0 - CF */ + 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, /* D0 - DF */ + 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, /* E0 - EF */ + 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, /* F0 - FF */ +}; + +static const unsigned char base64[256] = { + 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, /* 00 - 0F */ + 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, /* 10 - 1F */ + 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, /* 20 - 2F */ + 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 1, 0, 0, /* 30 - 3F */ + 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, /* 40 - 4F */ + 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, /* 50 - 5F */ + 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, /* 60 - 6F */ + 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, /* 70 - 7F */ + 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, /* 80 - 8F */ + 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, /* 90 - 9F */ + 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, /* A0 - AF */ + 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, /* B0 - BF */ + 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, /* C0 - CF */ + 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, /* D0 - DF */ + 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, /* E0 - EF */ + 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, /* F0 - FF */ +}; + +static const int base64num[128] = { + 0, 0, 0, 0, 0, 0, 0, 0, + 0, 0, 0, 0, 0, 0, 0, 0, /* 00 - 0F */ + 0, 0, 0, 0, 0, 0, 0, 0, + 0, 0, 0, 0, 0, 0, 0, 0, /* 10 - 1F */ + 0, 0, 0, 0, 0, 0, 0, 0, + 0, 0, 0, 62, 0, 0, 0, 63, /* 20 - 2F */ + 52, 53, 54, 55, 56, 57, 58, 59, + 60, 61, 0, 
0, 0, 0, 0, 0, /* 30 - 3F */ + 0, 0, 1, 2, 3, 4, 5, 6, + 7, 8, 9, 10, 11, 12, 13, 14, /* 40 - 4F */ + 15, 16, 17, 18, 19, 20, 21, 22, + 23, 24, 25, 0, 0, 0, 0, 0, /* 50 - 5F */ + 0, 26, 27, 28, 29, 30, 31, 32, + 33, 34, 35, 36, 37, 38, 39, 40, /* 60 - 6F */ + 41, 42, 43, 44, 45, 46, 47, 48, + 49, 50, 51, 0, 0, 0, 0, 0, /* 70 - 7F */ +}; + +static ssize_t +get_line(const unsigned char *b, ssize_t avail, ssize_t *nlsize) +{ + ssize_t len; + + len = 0; + while (len < avail) { + switch (ascii[*b]) { + case 0: /* Non-ascii character or control character. */ + if (nlsize != NULL) + *nlsize = 0; + return (-1); + case '\r': + if (avail-len > 1 && b[1] == '\n') { + if (nlsize != NULL) + *nlsize = 2; + return (len+2); + } + /* FALL THROUGH */ + case '\n': + if (nlsize != NULL) + *nlsize = 1; + return (len+1); + case 1: + b++; + len++; + break; + } + } + if (nlsize != NULL) + *nlsize = 0; + return (avail); +} + +static ssize_t +bid_get_line(struct archive_read_filter *filter, + const unsigned char **b, ssize_t *avail, ssize_t *ravail, ssize_t *nl) +{ + ssize_t len; + int quit; + + quit = 0; + if (*avail == 0) { + *nl = 0; + len = 0; + } else + len = get_line(*b, *avail, nl); + /* + * Read bytes more while it does not reach the end of line. + */ + while (*nl == 0 && len == *avail && !quit) { + ssize_t diff = *ravail - *avail; + + *b = __archive_read_filter_ahead(filter, 160 + *ravail, avail); + if (*b == NULL) { + if (*ravail >= *avail) + return (0); + /* Reading bytes reaches the end of file. 
*/ + *b = __archive_read_filter_ahead(filter, *avail, avail); + quit = 1; + } + *ravail = *avail; + *b += diff; + *avail -= diff; + len = get_line(*b, *avail, nl); + } + return (len); +} + +#define UUDECODE(c) (((c) - 0x20) & 0x3f) + +static int +uudecode_bidder_bid(struct archive_read_filter_bidder *self, + struct archive_read_filter *filter) +{ + const unsigned char *b; + ssize_t avail, ravail; + ssize_t len, nl; + int l; + int firstline; + + (void)self; /* UNUSED */ + + b = __archive_read_filter_ahead(filter, 1, &avail); + if (b == NULL) + return (0); + + firstline = 20; + ravail = avail; + for (;;) { + len = bid_get_line(filter, &b, &avail, &ravail, &nl); + if (len < 0 || nl == 0) + return (0);/* Binary data. */ + if (memcmp(b, "begin ", 6) == 0 && len - nl >= 11) + l = 6; + else if (memcmp(b, "begin-base64 ", 13) == 0 && len - nl >= 18) + l = 13; + else + l = 0; + + if (l > 0 && (b[l] < '0' || b[l] > '7' || + b[l+1] < '0' || b[l+1] > '7' || + b[l+2] < '0' || b[l+2] > '7' || b[l+3] != ' ')) + l = 0; + + b += len; + avail -= len; + if (l) + break; + firstline = 0; + } + if (!avail) + return (0); + len = bid_get_line(filter, &b, &avail, &ravail, &nl); + if (len < 0 || nl == 0) + return (0);/* There are non-ascii characters. */ + avail -= len; + + if (l == 6) { + if (!uuchar[*b]) + return (0); + /* Get a length of decoded bytes. */ + l = UUDECODE(*b++); len--; + if (l > 45) + /* Normally, maximum length is 45(character 'M'). */ + return (0); + while (l && len-nl > 0) { + if (l > 0) { + if (!uuchar[*b++]) + return (0); + if (!uuchar[*b++]) + return (0); + len -= 2; + --l; + } + if (l > 0) { + if (!uuchar[*b++]) + return (0); + --len; + --l; + } + if (l > 0) { + if (!uuchar[*b++]) + return (0); + --len; + --l; + } + } + if (len-nl < 0) + return (0); + if (len-nl == 1 && + (uuchar[*b] || /* Check sum. */ + (*b >= 'a' && *b <= 'z'))) {/* Padding data(MINIX). 
*/ + ++b; + --len; + } + b += nl; + if (avail && uuchar[*b]) + return (firstline+30); + } + if (l == 13) { + while (len-nl > 0) { + if (!base64[*b++]) + return (0); + --len; + } + b += nl; + + if (avail >= 5 && memcmp(b, "====\n", 5) == 0) + return (firstline+40); + if (avail >= 6 && memcmp(b, "====\r\n", 6) == 0) + return (firstline+40); + if (avail > 0 && base64[*b]) + return (firstline+30); + } + + return (0); +} + +static int +uudecode_bidder_init(struct archive_read_filter *self) +{ + struct uudecode *uudecode; + void *out_buff; + void *in_buff; + + self->code = ARCHIVE_COMPRESSION_UU; + self->name = "uu"; + self->read = uudecode_filter_read; + self->skip = NULL; /* not supported */ + self->close = uudecode_filter_close; + + uudecode = (struct uudecode *)calloc(sizeof(*uudecode), 1); + out_buff = malloc(OUT_BUFF_SIZE); + in_buff = malloc(IN_BUFF_SIZE); + if (uudecode == NULL || out_buff == NULL || in_buff == NULL) { + archive_set_error(&self->archive->archive, ENOMEM, + "Can't allocate data for uudecode"); + free(uudecode); + free(out_buff); + free(in_buff); + return (ARCHIVE_FATAL); + } + + self->data = uudecode; + uudecode->in_buff = in_buff; + uudecode->in_cnt = 0; + uudecode->in_allocated = IN_BUFF_SIZE; + uudecode->out_buff = out_buff; + uudecode->state = ST_FIND_HEAD; + + return (ARCHIVE_OK); +} + +static int +ensure_in_buff_size(struct archive_read_filter *self, + struct uudecode *uudecode, size_t size) +{ + + if (size > uudecode->in_allocated) { + unsigned char *ptr; + size_t newsize; + + newsize = uudecode->in_allocated << 1; + ptr = malloc(newsize); + if (ptr == NULL || + newsize < uudecode->in_allocated) { + free(ptr); + archive_set_error(&self->archive->archive, + ENOMEM, + "Can't allocate data for uudecode"); + return (ARCHIVE_FATAL); + } + if (uudecode->in_cnt) + memmove(ptr, uudecode->in_buff, + uudecode->in_cnt); + free(uudecode->in_buff); + uudecode->in_buff = ptr; + uudecode->in_allocated = newsize; + } + return (ARCHIVE_OK); +} + +static 
ssize_t +uudecode_filter_read(struct archive_read_filter *self, const void **buff) +{ + struct uudecode *uudecode; + const unsigned char *b, *d; + unsigned char *out; + ssize_t avail_in, ravail; + ssize_t used; + ssize_t total; + ssize_t len, llen, nl; + + uudecode = (struct uudecode *)self->data; + +read_more: + d = __archive_read_filter_ahead(self->upstream, 1, &avail_in); + if (d == NULL && avail_in < 0) + return (ARCHIVE_FATAL); + /* Quiet a code analyzer; make sure avail_in must be zero + * when d is NULL. */ + if (d == NULL) + avail_in = 0; + used = 0; + total = 0; + out = uudecode->out_buff; + ravail = avail_in; + if (uudecode->in_cnt) { + /* + * If there is remaining data which is saved by + * previous calling, use it first. + */ + if (ensure_in_buff_size(self, uudecode, + avail_in + uudecode->in_cnt) != ARCHIVE_OK) + return (ARCHIVE_FATAL); + memcpy(uudecode->in_buff + uudecode->in_cnt, + d, avail_in); + d = uudecode->in_buff; + avail_in += uudecode->in_cnt; + uudecode->in_cnt = 0; + } + for (;used < avail_in; d += llen, used += llen) { + int l, body; + + b = d; + len = get_line(b, avail_in - used, &nl); + if (len < 0) { + /* Non-ascii character is found. */ + archive_set_error(&self->archive->archive, + ARCHIVE_ERRNO_MISC, + "Insufficient compressed data"); + return (ARCHIVE_FATAL); + } + llen = len; + if (nl == 0) { + /* + * Save remaining data which does not contain + * NL('\n','\r'). + */ + if (ensure_in_buff_size(self, uudecode, len) + != ARCHIVE_OK) + return (ARCHIVE_FATAL); + if (uudecode->in_buff != b) + memmove(uudecode->in_buff, b, len); + uudecode->in_cnt = len; + if (total == 0) { + /* Do not return 0; it means end-of-file. + * We should try to read bytes more. 
*/ + __archive_read_filter_consume( + self->upstream, ravail); + goto read_more; + } + break; + } + if (total + len * 2 > OUT_BUFF_SIZE) + break; + switch (uudecode->state) { + default: + case ST_FIND_HEAD: + if (len - nl > 13 && memcmp(b, "begin ", 6) == 0) + l = 6; + else if (len - nl > 18 && + memcmp(b, "begin-base64 ", 13) == 0) + l = 13; + else + l = 0; + if (l != 0 && b[l] >= '0' && b[l] <= '7' && + b[l+1] >= '0' && b[l+1] <= '7' && + b[l+2] >= '0' && b[l+2] <= '7' && b[l+3] == ' ') { + if (l == 6) + uudecode->state = ST_READ_UU; + else + uudecode->state = ST_READ_BASE64; + } + break; + case ST_READ_UU: + body = len - nl; + if (!uuchar[*b] || body <= 0) { + archive_set_error(&self->archive->archive, + ARCHIVE_ERRNO_MISC, + "Insufficient compressed data"); + return (ARCHIVE_FATAL); + } + /* Get length of undecoded bytes of curent line. */ + l = UUDECODE(*b++); + body--; + if (l > body) { + archive_set_error(&self->archive->archive, + ARCHIVE_ERRNO_MISC, + "Insufficient compressed data"); + return (ARCHIVE_FATAL); + } + if (l == 0) { + uudecode->state = ST_UUEND; + break; + } + while (l > 0) { + int n = 0; + + if (l > 0) { + if (!uuchar[b[0]] || !uuchar[b[1]]) + break; + n = UUDECODE(*b++) << 18; + n |= UUDECODE(*b++) << 12; + *out++ = n >> 16; total++; + --l; + } + if (l > 0) { + if (!uuchar[b[0]]) + break; + n |= UUDECODE(*b++) << 6; + *out++ = (n >> 8) & 0xFF; total++; + --l; + } + if (l > 0) { + if (!uuchar[b[0]]) + break; + n |= UUDECODE(*b++); + *out++ = n & 0xFF; total++; + --l; + } + } + if (l) { + archive_set_error(&self->archive->archive, + ARCHIVE_ERRNO_MISC, + "Insufficient compressed data"); + return (ARCHIVE_FATAL); + } + break; + case ST_UUEND: + if (len - nl == 3 && memcmp(b, "end ", 3) == 0) + uudecode->state = ST_FIND_HEAD; + else { + archive_set_error(&self->archive->archive, + ARCHIVE_ERRNO_MISC, + "Insufficient compressed data"); + return (ARCHIVE_FATAL); + } + break; + case ST_READ_BASE64: + l = len - nl; + if (l >= 3 && b[0] == '=' && 
b[1] == '=' && + b[2] == '=') { + uudecode->state = ST_FIND_HEAD; + break; + } + while (l > 0) { + int n = 0; + + if (l > 0) { + if (!base64[b[0]] || !base64[b[1]]) + break; + n = base64num[*b++] << 18; + n |= base64num[*b++] << 12; + *out++ = n >> 16; total++; + l -= 2; + } + if (l > 0) { + if (*b == '=') + break; + if (!base64[*b]) + break; + n |= base64num[*b++] << 6; + *out++ = (n >> 8) & 0xFF; total++; + --l; + } + if (l > 0) { + if (*b == '=') + break; + if (!base64[*b]) + break; + n |= base64num[*b++]; + *out++ = n & 0xFF; total++; + --l; + } + } + if (l && *b != '=') { + archive_set_error(&self->archive->archive, + ARCHIVE_ERRNO_MISC, + "Insufficient compressed data"); + return (ARCHIVE_FATAL); + } + break; + } + } + + __archive_read_filter_consume(self->upstream, ravail); + + *buff = uudecode->out_buff; + uudecode->total += total; + return (total); +} + +static int +uudecode_filter_close(struct archive_read_filter *self) +{ + struct uudecode *uudecode; + + uudecode = (struct uudecode *)self->data; + free(uudecode->in_buff); + free(uudecode->out_buff); + free(uudecode); + + return (ARCHIVE_OK); +} + diff --git a/lib/libarchive/archive_read_support_compression_xz.c b/lib/libarchive/archive_read_support_compression_xz.c new file mode 100644 index 000000000..198d6b96e --- /dev/null +++ b/lib/libarchive/archive_read_support_compression_xz.c @@ -0,0 +1,718 @@ +/*- + * Copyright (c) 2009 Michihiro NAKAJIMA + * Copyright (c) 2003-2008 Tim Kientzle and Miklos Vajna + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. 
Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR(S) ``AS IS'' AND ANY EXPRESS OR + * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES + * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. + * IN NO EVENT SHALL THE AUTHOR(S) BE LIABLE FOR ANY DIRECT, INDIRECT, + * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT + * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF + * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + */ + +#include "archive_platform.h" + +__FBSDID("$FreeBSD: head/lib/libarchive/archive_read_support_compression_xz.c 201167 2009-12-29 06:06:20Z kientzle $"); + +#ifdef HAVE_ERRNO_H +#include <errno.h> +#endif +#include <stdio.h> +#ifdef HAVE_STDLIB_H +#include <stdlib.h> +#endif +#ifdef HAVE_STRING_H +#include <string.h> +#endif +#ifdef HAVE_UNISTD_H +#include <unistd.h> +#endif +#if HAVE_LZMA_H +#include <lzma.h> +#elif HAVE_LZMADEC_H +#include <lzmadec.h> +#endif + +#include "archive.h" +#include "archive_endian.h" +#include "archive_private.h" +#include "archive_read_private.h" + +#if HAVE_LZMA_H && HAVE_LIBLZMA + +struct private_data { + lzma_stream stream; + unsigned char *out_block; + size_t out_block_size; + int64_t total_out; + char eof; /* True = found end of compressed data. 
*/ +}; + +/* Combined lzma/xz filter */ +static ssize_t xz_filter_read(struct archive_read_filter *, const void **); +static int xz_filter_close(struct archive_read_filter *); +static int xz_lzma_bidder_init(struct archive_read_filter *); + +#elif HAVE_LZMADEC_H && HAVE_LIBLZMADEC + +struct private_data { + lzmadec_stream stream; + unsigned char *out_block; + size_t out_block_size; + int64_t total_out; + char eof; /* True = found end of compressed data. */ +}; + +/* Lzma-only filter */ +static ssize_t lzma_filter_read(struct archive_read_filter *, const void **); +static int lzma_filter_close(struct archive_read_filter *); +#endif + +/* + * Note that we can detect xz and lzma compressed files even if we + * can't decompress them. (In fact, we like detecting them because we + * can give better error messages.) So the bid framework here gets + * compiled even if no lzma library is available. + */ +static int xz_bidder_bid(struct archive_read_filter_bidder *, + struct archive_read_filter *); +static int xz_bidder_init(struct archive_read_filter *); +static int lzma_bidder_bid(struct archive_read_filter_bidder *, + struct archive_read_filter *); +static int lzma_bidder_init(struct archive_read_filter *); + +int +archive_read_support_compression_xz(struct archive *_a) +{ + struct archive_read *a = (struct archive_read *)_a; + struct archive_read_filter_bidder *bidder = __archive_read_get_bidder(a); + + archive_clear_error(_a); + if (bidder == NULL) + return (ARCHIVE_FATAL); + + bidder->data = NULL; + bidder->bid = xz_bidder_bid; + bidder->init = xz_bidder_init; + bidder->options = NULL; + bidder->free = NULL; +#if HAVE_LZMA_H && HAVE_LIBLZMA + return (ARCHIVE_OK); +#else + archive_set_error(_a, ARCHIVE_ERRNO_MISC, + "Using external unxz program for xz decompression"); + return (ARCHIVE_WARN); +#endif +} + +int +archive_read_support_compression_lzma(struct archive *_a) +{ + struct archive_read *a = (struct archive_read *)_a; + struct archive_read_filter_bidder *bidder = 
__archive_read_get_bidder(a); + + archive_clear_error(_a); + if (bidder == NULL) + return (ARCHIVE_FATAL); + + bidder->data = NULL; + bidder->bid = lzma_bidder_bid; + bidder->init = lzma_bidder_init; + bidder->options = NULL; + bidder->free = NULL; +#if HAVE_LZMA_H && HAVE_LIBLZMA + return (ARCHIVE_OK); +#elif HAVE_LZMADEC_H && HAVE_LIBLZMADEC + return (ARCHIVE_OK); +#else + archive_set_error(_a, ARCHIVE_ERRNO_MISC, + "Using external unlzma program for lzma decompression"); + return (ARCHIVE_WARN); +#endif +} + +/* + * Test whether we can handle this data. + */ +static int +xz_bidder_bid(struct archive_read_filter_bidder *self, + struct archive_read_filter *filter) +{ + const unsigned char *buffer; + ssize_t avail; + int bits_checked; + + (void)self; /* UNUSED */ + + buffer = __archive_read_filter_ahead(filter, 6, &avail); + if (buffer == NULL) + return (0); + + /* + * Verify Header Magic Bytes : FD 37 7A 58 5A 00 + */ + bits_checked = 0; + if (buffer[0] != 0xFD) + return (0); + bits_checked += 8; + if (buffer[1] != 0x37) + return (0); + bits_checked += 8; + if (buffer[2] != 0x7A) + return (0); + bits_checked += 8; + if (buffer[3] != 0x58) + return (0); + bits_checked += 8; + if (buffer[4] != 0x5A) + return (0); + bits_checked += 8; + if (buffer[5] != 0x00) + return (0); + bits_checked += 8; + + return (bits_checked); +} + +/* + * Test whether we can handle this data. + * + * LZMA has a rather poor file signature. Zeros do not + * make good signature bytes as a rule, and the only non-zero byte + * here is an ASCII character. For example, an uncompressed tar + * archive whose first file is ']' would satisfy this check. It may + * be necessary to exclude LZMA from compression_all() because of + * this. Clients of libarchive would then have to explicitly enable + * LZMA checking instead of (or in addition to) compression_all() when + * they have other evidence (file name, command-line option) to go on. 
+ */ +static int +lzma_bidder_bid(struct archive_read_filter_bidder *self, + struct archive_read_filter *filter) +{ + const unsigned char *buffer; + ssize_t avail; + uint32_t dicsize; +#ifndef __minix + uint64_t uncompressed_size; +#else + u64_t uncompressed_size; +#endif + + int bits_checked; + + (void)self; /* UNUSED */ + + buffer = __archive_read_filter_ahead(filter, 14, &avail); + if (buffer == NULL) + return (0); + + /* First byte of raw LZMA stream is commonly 0x5d. + * The first byte is a special number, which consists of + * three parameters of LZMA compression, a number of literal + * context bits(which is from 0 to 8, default is 3), a number + * of literal pos bits(which is from 0 to 4, default is 0), + * a number of pos bits(which is from 0 to 4, default is 2). + * The first byte is made by + * (pos bits * 5 + literal pos bit) * 9 + * literal contest bit, + * and so the default value in this field is + * (2 * 5 + 0) * 9 + 3 = 0x5d. + * lzma of LZMA SDK has options to change those parameters. + * It means a range of this field is from 0 to 224. And lzma of + * XZ Utils with option -e records 0x5e in this field. */ + /* NOTE: If this checking of the first byte increases false + * recognition, we should allow only 0x5d and 0x5e for the first + * byte of LZMA stream. */ + bits_checked = 0; + if (buffer[0] > (4 * 5 + 4) * 9 + 8) + return (0); + /* Most likely value in the first byte of LZMA stream. */ + if (buffer[0] == 0x5d || buffer[0] == 0x5e) + bits_checked += 8; + + /* Sixth through fourteenth bytes are uncompressed size, + * stored in little-endian order. `-1' means uncompressed + * size is unknown and lzma of XZ Utils always records `-1' + * in this field. 
*/ + uncompressed_size = archive_le64dec(buffer+5); +#ifndef __minix + if (uncompressed_size == (uint64_t)ARCHIVE_LITERAL_LL(-1)) + bits_checked += 64; +#else + if (cmp64(uncompressed_size, make64(ULONG_MAX, ULONG_MAX)) == 0) + bits_checked += 64; +#endif + + /* Second through fifth bytes are dictionary size, stored in + * little-endian order. The minimum dictionary size is + * 1 << 12(4KiB) which the lzma of LZMA SDK uses with option + * -d12 and the maxinam dictionary size is 1 << 27(128MiB) + * which the one uses with option -d27. + * NOTE: A comment of LZMA SDK source code says this dictionary + * range is from 1 << 12 to 1 << 30. */ + dicsize = archive_le32dec(buffer+1); + switch (dicsize) { + case 0x00001000:/* lzma of LZMA SDK option -d12. */ + case 0x00002000:/* lzma of LZMA SDK option -d13. */ + case 0x00004000:/* lzma of LZMA SDK option -d14. */ + case 0x00008000:/* lzma of LZMA SDK option -d15. */ + case 0x00010000:/* lzma of XZ Utils option -0 and -1. + * lzma of LZMA SDK option -d16. */ + case 0x00020000:/* lzma of LZMA SDK option -d17. */ + case 0x00040000:/* lzma of LZMA SDK option -d18. */ + case 0x00080000:/* lzma of XZ Utils option -2. + * lzma of LZMA SDK option -d19. */ + case 0x00100000:/* lzma of XZ Utils option -3. + * lzma of LZMA SDK option -d20. */ + case 0x00200000:/* lzma of XZ Utils option -4. + * lzma of LZMA SDK option -d21. */ + case 0x00400000:/* lzma of XZ Utils option -5. + * lzma of LZMA SDK option -d22. */ + case 0x00800000:/* lzma of XZ Utils option -6. + * lzma of LZMA SDK option -d23. */ + case 0x01000000:/* lzma of XZ Utils option -7. + * lzma of LZMA SDK option -d24. */ + case 0x02000000:/* lzma of XZ Utils option -8. + * lzma of LZMA SDK option -d25. */ + case 0x04000000:/* lzma of XZ Utils option -9. + * lzma of LZMA SDK option -d26. */ + case 0x08000000:/* lzma of LZMA SDK option -d27. 
*/ + bits_checked += 32; + break; + default: + /* If a memory usage for encoding was not enough on + * the platform where LZMA stream was made, lzma of + * XZ Utils automatically decreased the dictionary + * size to enough memory for encoding by 1Mi bytes + * (1 << 20).*/ + if (dicsize <= 0x03F00000 && dicsize >= 0x00300000 && + (dicsize & ((1 << 20)-1)) == 0 && + bits_checked == 8 + 64) { + bits_checked += 32; + break; + } + /* Otherwise dictionary size is unlikely. But it is + * possible that someone makes lzma stream with + * liblzma/LZMA SDK in one's dictionary size. */ + return (0); + } + + /* TODO: The above test is still very weak. It would be + * good to do better. */ + + return (bits_checked); +} + +#if HAVE_LZMA_H && HAVE_LIBLZMA + +/* + * liblzma 4.999.7 and later support both lzma and xz streams. + */ +static int +xz_bidder_init(struct archive_read_filter *self) +{ + self->code = ARCHIVE_COMPRESSION_XZ; + self->name = "xz"; + return (xz_lzma_bidder_init(self)); +} + +static int +lzma_bidder_init(struct archive_read_filter *self) +{ + self->code = ARCHIVE_COMPRESSION_LZMA; + self->name = "lzma"; + return (xz_lzma_bidder_init(self)); +} + +/* + * Setup the callbacks. 
+ */ +static int +xz_lzma_bidder_init(struct archive_read_filter *self) +{ + static const size_t out_block_size = 64 * 1024; + void *out_block; + struct private_data *state; + int ret; + + state = (struct private_data *)calloc(sizeof(*state), 1); + out_block = (unsigned char *)malloc(out_block_size); + if (state == NULL || out_block == NULL) { + archive_set_error(&self->archive->archive, ENOMEM, + "Can't allocate data for xz decompression"); + free(out_block); + free(state); + return (ARCHIVE_FATAL); + } + + self->data = state; + state->out_block_size = out_block_size; + state->out_block = out_block; + self->read = xz_filter_read; + self->skip = NULL; /* not supported */ + self->close = xz_filter_close; + + state->stream.avail_in = 0; + + state->stream.next_out = state->out_block; + state->stream.avail_out = state->out_block_size; + + /* Initialize compression library. + * TODO: I don't know what value is best for memlimit. + * maybe, it needs to check memory size which + * running system has. + */ + if (self->code == ARCHIVE_COMPRESSION_XZ) + ret = lzma_stream_decoder(&(state->stream), + (1U << 30),/* memlimit */ + LZMA_CONCATENATED); + else + ret = lzma_alone_decoder(&(state->stream), + (1U << 30));/* memlimit */ + + if (ret == LZMA_OK) + return (ARCHIVE_OK); + + /* Library setup failed: Choose an error message and clean up. 
*/ + switch (ret) { + case LZMA_MEM_ERROR: + archive_set_error(&self->archive->archive, ENOMEM, + "Internal error initializing compression library: " + "Cannot allocate memory"); + break; + case LZMA_OPTIONS_ERROR: + archive_set_error(&self->archive->archive, + ARCHIVE_ERRNO_MISC, + "Internal error initializing compression library: " + "Invalid or unsupported options"); + break; + default: + archive_set_error(&self->archive->archive, ARCHIVE_ERRNO_MISC, + "Internal error initializing lzma library"); + break; + } + + free(state->out_block); + free(state); + self->data = NULL; + return (ARCHIVE_FATAL); +} + +/* + * Return the next block of decompressed data. + */ +static ssize_t +xz_filter_read(struct archive_read_filter *self, const void **p) +{ + struct private_data *state; + size_t decompressed; + ssize_t avail_in; + int ret; + + state = (struct private_data *)self->data; + + /* Empty our output buffer. */ + state->stream.next_out = state->out_block; + state->stream.avail_out = state->out_block_size; + + /* Try to fill the output buffer. */ + while (state->stream.avail_out > 0 && !state->eof) { + state->stream.next_in = + __archive_read_filter_ahead(self->upstream, 1, &avail_in); + if (state->stream.next_in == NULL && avail_in < 0) + return (ARCHIVE_FATAL); + state->stream.avail_in = avail_in; + + /* Decompress as much as we can in one pass. */ + ret = lzma_code(&(state->stream), + (state->stream.avail_in == 0)? LZMA_FINISH: LZMA_RUN); + switch (ret) { + case LZMA_STREAM_END: /* Found end of stream. */ + state->eof = 1; + /* FALL THROUGH */ + case LZMA_OK: /* Decompressor made some progress. 
*/ + __archive_read_filter_consume(self->upstream, + avail_in - state->stream.avail_in); + break; + case LZMA_MEM_ERROR: + archive_set_error(&self->archive->archive, ENOMEM, + "Lzma library error: Cannot allocate memory"); + return (ARCHIVE_FATAL); + case LZMA_MEMLIMIT_ERROR: + archive_set_error(&self->archive->archive, ENOMEM, + "Lzma library error: Out of memory"); + return (ARCHIVE_FATAL); + case LZMA_FORMAT_ERROR: + archive_set_error(&self->archive->archive, + ARCHIVE_ERRNO_MISC, + "Lzma library error: format not recognized"); + return (ARCHIVE_FATAL); + case LZMA_OPTIONS_ERROR: + archive_set_error(&self->archive->archive, + ARCHIVE_ERRNO_MISC, + "Lzma library error: Invalid options"); + return (ARCHIVE_FATAL); + case LZMA_DATA_ERROR: + archive_set_error(&self->archive->archive, + ARCHIVE_ERRNO_MISC, + "Lzma library error: Corrupted input data"); + return (ARCHIVE_FATAL); + case LZMA_BUF_ERROR: + archive_set_error(&self->archive->archive, + ARCHIVE_ERRNO_MISC, + "Lzma library error: No progress is possible"); + return (ARCHIVE_FATAL); + default: + /* Return an error. */ + archive_set_error(&self->archive->archive, + ARCHIVE_ERRNO_MISC, + "Lzma decompression failed: Unknown error"); + return (ARCHIVE_FATAL); + } + } + + decompressed = state->stream.next_out - state->out_block; + state->total_out += decompressed; + if (decompressed == 0) + *p = NULL; + else + *p = state->out_block; + return (decompressed); +} + +/* + * Clean up the decompressor. + */ +static int +xz_filter_close(struct archive_read_filter *self) +{ + struct private_data *state; + + state = (struct private_data *)self->data; + lzma_end(&(state->stream)); + free(state->out_block); + free(state); + return (ARCHIVE_OK); +} + +#else + +#if HAVE_LZMADEC_H && HAVE_LIBLZMADEC + +/* + * If we have the older liblzmadec library, then we can handle + * LZMA streams but not XZ streams. + */ + +/* + * Setup the callbacks. 
+ */ +static int +lzma_bidder_init(struct archive_read_filter *self) +{ + static const size_t out_block_size = 64 * 1024; + void *out_block; + struct private_data *state; + ssize_t ret, avail_in; + + self->code = ARCHIVE_COMPRESSION_LZMA; + self->name = "lzma"; + + state = (struct private_data *)calloc(sizeof(*state), 1); + out_block = (unsigned char *)malloc(out_block_size); + if (state == NULL || out_block == NULL) { + archive_set_error(&self->archive->archive, ENOMEM, + "Can't allocate data for lzma decompression"); + free(out_block); + free(state); + return (ARCHIVE_FATAL); + } + + self->data = state; + state->out_block_size = out_block_size; + state->out_block = out_block; + self->read = lzma_filter_read; + self->skip = NULL; /* not supported */ + self->close = lzma_filter_close; + + /* Prime the lzma library with 18 bytes of input. */ + state->stream.next_in = (unsigned char *)(uintptr_t) + __archive_read_filter_ahead(self->upstream, 18, &avail_in); + if (state->stream.next_in == NULL) + return (ARCHIVE_FATAL); + state->stream.avail_in = avail_in; + state->stream.next_out = state->out_block; + state->stream.avail_out = state->out_block_size; + + /* Initialize compression library. */ + ret = lzmadec_init(&(state->stream)); + __archive_read_filter_consume(self->upstream, + avail_in - state->stream.avail_in); + if (ret == LZMADEC_OK) + return (ARCHIVE_OK); + + /* Library setup failed: Clean up. */ + archive_set_error(&self->archive->archive, ARCHIVE_ERRNO_MISC, + "Internal error initializing lzma library"); + + /* Override the error message if we know what really went wrong. 
*/ + switch (ret) { + case LZMADEC_HEADER_ERROR: + archive_set_error(&self->archive->archive, + ARCHIVE_ERRNO_MISC, + "Internal error initializing compression library: " + "invalid header"); + break; + case LZMADEC_MEM_ERROR: + archive_set_error(&self->archive->archive, ENOMEM, + "Internal error initializing compression library: " + "out of memory"); + break; + } + + free(state->out_block); + free(state); + self->data = NULL; + return (ARCHIVE_FATAL); +} + +/* + * Return the next block of decompressed data. + */ +static ssize_t +lzma_filter_read(struct archive_read_filter *self, const void **p) +{ + struct private_data *state; + size_t decompressed; + ssize_t avail_in, ret; + + state = (struct private_data *)self->data; + + /* Empty our output buffer. */ + state->stream.next_out = state->out_block; + state->stream.avail_out = state->out_block_size; + + /* Try to fill the output buffer. */ + while (state->stream.avail_out > 0 && !state->eof) { + state->stream.next_in = (unsigned char *)(uintptr_t) + __archive_read_filter_ahead(self->upstream, 1, &avail_in); + if (state->stream.next_in == NULL && avail_in < 0) + return (ARCHIVE_FATAL); + state->stream.avail_in = avail_in; + + /* Decompress as much as we can in one pass. */ + ret = lzmadec_decode(&(state->stream), avail_in == 0); + switch (ret) { + case LZMADEC_STREAM_END: /* Found end of stream. */ + state->eof = 1; + /* FALL THROUGH */ + case LZMADEC_OK: /* Decompressor made some progress. */ + __archive_read_filter_consume(self->upstream, + avail_in - state->stream.avail_in); + break; + case LZMADEC_BUF_ERROR: /* Insufficient input data? */ + archive_set_error(&self->archive->archive, + ARCHIVE_ERRNO_MISC, + "Insufficient compressed data"); + return (ARCHIVE_FATAL); + default: + /* Return an error. 
*/ + archive_set_error(&self->archive->archive, + ARCHIVE_ERRNO_MISC, + "Lzma decompression failed"); + return (ARCHIVE_FATAL); + } + } + + decompressed = state->stream.next_out - state->out_block; + state->total_out += decompressed; + if (decompressed == 0) + *p = NULL; + else + *p = state->out_block; + return (decompressed); +} + +/* + * Clean up the decompressor. + */ +static int +lzma_filter_close(struct archive_read_filter *self) +{ + struct private_data *state; + int ret; + + state = (struct private_data *)self->data; + ret = ARCHIVE_OK; + switch (lzmadec_end(&(state->stream))) { + case LZMADEC_OK: + break; + default: + archive_set_error(&(self->archive->archive), + ARCHIVE_ERRNO_MISC, + "Failed to clean up %s compressor", + self->archive->archive.compression_name); + ret = ARCHIVE_FATAL; + } + + free(state->out_block); + free(state); + return (ret); +} + +#else + +/* + * + * If we have no suitable library on this system, we can't actually do + * the decompression. We can, however, still detect compressed + * archives and emit a useful message. + * + */ +static int +lzma_bidder_init(struct archive_read_filter *self) +{ + int r; + + r = __archive_read_program(self, "unlzma"); + /* Note: We set the format here even if __archive_read_program() + * above fails. We do, after all, know what the format is + * even if we weren't able to read it. */ + self->code = ARCHIVE_COMPRESSION_LZMA; + self->name = "lzma"; + return (r); +} + +#endif /* HAVE_LZMADEC_H */ + + +static int +xz_bidder_init(struct archive_read_filter *self) +{ + int r; + + r = __archive_read_program(self, "unxz"); + /* Note: We set the format here even if __archive_read_program() + * above fails. We do, after all, know what the format is + * even if we weren't able to read it. 
*/ + self->code = ARCHIVE_COMPRESSION_XZ; + self->name = "xz"; + return (r); +} + + +#endif /* HAVE_LZMA_H */ diff --git a/lib/libarchive/archive_read_support_format_all.c b/lib/libarchive/archive_read_support_format_all.c new file mode 100644 index 000000000..fdd52fb07 --- /dev/null +++ b/lib/libarchive/archive_read_support_format_all.c @@ -0,0 +1,45 @@ +/*- + * Copyright (c) 2003-2007 Tim Kientzle + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR(S) ``AS IS'' AND ANY EXPRESS OR + * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES + * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. + * IN NO EVENT SHALL THE AUTHOR(S) BE LIABLE FOR ANY DIRECT, INDIRECT, + * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT + * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF + * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 
+ */ + +#include "archive_platform.h" +__FBSDID("$FreeBSD: head/lib/libarchive/archive_read_support_format_all.c 174991 2007-12-30 04:58:22Z kientzle $"); + +#include "archive.h" + +int +archive_read_support_format_all(struct archive *a) +{ + archive_read_support_format_ar(a); + archive_read_support_format_empty(a); +#ifndef __minix + archive_read_support_format_iso9660(a); + archive_read_support_format_cpio(a); +#endif + archive_read_support_format_mtree(a); + archive_read_support_format_tar(a); + archive_read_support_format_xar(a); + archive_read_support_format_zip(a); + return (ARCHIVE_OK); +} diff --git a/lib/libarchive/archive_read_support_format_ar.c b/lib/libarchive/archive_read_support_format_ar.c new file mode 100644 index 000000000..84eabb10e --- /dev/null +++ b/lib/libarchive/archive_read_support_format_ar.c @@ -0,0 +1,679 @@ +/*- + * Copyright (c) 2007 Kai Wang + * Copyright (c) 2007 Tim Kientzle + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer + * in this position and unchanged. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR(S) ``AS IS'' AND ANY EXPRESS OR + * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES + * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. 
+ * IN NO EVENT SHALL THE AUTHOR(S) BE LIABLE FOR ANY DIRECT, INDIRECT, + * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT + * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF + * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + */ + +#include "archive_platform.h" +__FBSDID("$FreeBSD: head/lib/libarchive/archive_read_support_format_ar.c 201101 2009-12-28 03:06:27Z kientzle $"); + +#ifdef HAVE_SYS_STAT_H +#include <sys/stat.h> +#endif +#ifdef HAVE_ERRNO_H +#include <errno.h> +#endif +#ifdef HAVE_STDLIB_H +#include <stdlib.h> +#endif +#ifdef HAVE_STRING_H +#include <string.h> +#endif +#ifdef HAVE_LIMITS_H +#include <limits.h> +#endif + +#include "archive.h" +#include "archive_entry.h" +#include "archive_private.h" +#include "archive_read_private.h" + +struct ar { + off_t entry_bytes_remaining; + off_t entry_offset; + off_t entry_padding; + char *strtab; + size_t strtab_size; +}; + +/* + * Define structure of the "ar" header.
+ */ +#define AR_name_offset 0 +#define AR_name_size 16 +#define AR_date_offset 16 +#define AR_date_size 12 +#define AR_uid_offset 28 +#define AR_uid_size 6 +#define AR_gid_offset 34 +#define AR_gid_size 6 +#define AR_mode_offset 40 +#define AR_mode_size 8 +#define AR_size_offset 48 +#define AR_size_size 10 +#define AR_fmag_offset 58 +#define AR_fmag_size 2 + +static int archive_read_format_ar_bid(struct archive_read *a); +static int archive_read_format_ar_cleanup(struct archive_read *a); +static int archive_read_format_ar_read_data(struct archive_read *a, + const void **buff, size_t *size, off_t *offset); +static int archive_read_format_ar_skip(struct archive_read *a); +static int archive_read_format_ar_read_header(struct archive_read *a, + struct archive_entry *e); +#ifndef __minix +static uint64_t ar_atol8(const char *p, unsigned char_cnt); +static uint64_t ar_atol10(const char *p, unsigned char_cnt); +#else +static uint32_t ar_atol8(const char *p, unsigned char_cnt); +static uint32_t ar_atol10(const char *p, unsigned char_cnt); +#endif + +static int ar_parse_gnu_filename_table(struct archive_read *a); +static int ar_parse_common_header(struct ar *ar, struct archive_entry *, + const char *h); + +int +archive_read_support_format_ar(struct archive *_a) +{ + struct archive_read *a = (struct archive_read *)_a; + struct ar *ar; + int r; + + ar = (struct ar *)malloc(sizeof(*ar)); + if (ar == NULL) { + archive_set_error(&a->archive, ENOMEM, + "Can't allocate ar data"); + return (ARCHIVE_FATAL); + } + memset(ar, 0, sizeof(*ar)); + ar->strtab = NULL; + + r = __archive_read_register_format(a, + ar, + "ar", + archive_read_format_ar_bid, + NULL, + archive_read_format_ar_read_header, + archive_read_format_ar_read_data, + archive_read_format_ar_skip, + archive_read_format_ar_cleanup); + + if (r != ARCHIVE_OK) { + free(ar); + return (r); + } + return (ARCHIVE_OK); +} + +static int +archive_read_format_ar_cleanup(struct archive_read *a) +{ + struct ar *ar; + + ar = (struct ar 
*)(a->format->data); + if (ar->strtab) + free(ar->strtab); + free(ar); + (a->format->data) = NULL; + return (ARCHIVE_OK); +} + +static int +archive_read_format_ar_bid(struct archive_read *a) +{ + const void *h; + + if (a->archive.archive_format != 0 && + (a->archive.archive_format & ARCHIVE_FORMAT_BASE_MASK) != + ARCHIVE_FORMAT_AR) + return(0); + + /* + * Verify the 8-byte file signature. + * TODO: Do we need to check more than this? + */ + if ((h = __archive_read_ahead(a, 8, NULL)) == NULL) + return (-1); + if (strncmp((const char*)h, "!<arch>\n", 8) == 0) { + return (64); + } + return (-1); +} + +static int +archive_read_format_ar_read_header(struct archive_read *a, + struct archive_entry *entry) +{ + char filename[AR_name_size + 1]; + struct ar *ar; +#ifndef __minix + uint64_t number; /* Used to hold parsed numbers before validation. */ +#else + uint32_t number; /* Used to hold parsed numbers before validation. */ +#endif + ssize_t bytes_read; + size_t bsd_name_length, entry_size; + char *p, *st; + const void *b; + const char *h; + int r; + + ar = (struct ar*)(a->format->data); + + if (a->archive.file_position == 0) { + /* + * We are now at the beginning of the archive, + * so we first need to consume the ar global header. + */ + __archive_read_consume(a, 8); + /* Set a default format code for now. */ + a->archive.archive_format = ARCHIVE_FORMAT_AR; + } + + /* Read the header for the next file entry. */ + if ((b = __archive_read_ahead(a, 60, &bytes_read)) == NULL) + /* Broken header. */ + return (ARCHIVE_EOF); + __archive_read_consume(a, 60); + h = (const char *)b; + + /* Verify the magic signature on the file header. */ + if (strncmp(h + AR_fmag_offset, "`\n", 2) != 0) { + archive_set_error(&a->archive, EINVAL, + "Incorrect file header signature"); + return (ARCHIVE_WARN); + } + + /* Copy filename into work buffer. */ + strncpy(filename, h + AR_name_offset, AR_name_size); + filename[AR_name_size] = '\0'; + + /* + * Guess the format variant based on the filename.
+ */ + if (a->archive.archive_format == ARCHIVE_FORMAT_AR) { + /* We don't already know the variant, so let's guess. */ + /* + * Biggest clue is presence of '/': GNU starts special + * filenames with '/', appends '/' as terminator to + * non-special names, so anything with '/' should be + * GNU except for BSD long filenames. + */ + if (strncmp(filename, "#1/", 3) == 0) + a->archive.archive_format = ARCHIVE_FORMAT_AR_BSD; + else if (strchr(filename, '/') != NULL) + a->archive.archive_format = ARCHIVE_FORMAT_AR_GNU; + else if (strncmp(filename, "__.SYMDEF", 9) == 0) + a->archive.archive_format = ARCHIVE_FORMAT_AR_BSD; + /* + * XXX Do GNU/SVR4 'ar' programs ever omit trailing '/' + * if name exactly fills 16-byte field? If so, we + * can't assume entries without '/' are BSD. XXX + */ + } + + /* Update format name from the code. */ + if (a->archive.archive_format == ARCHIVE_FORMAT_AR_GNU) + a->archive.archive_format_name = "ar (GNU/SVR4)"; + else if (a->archive.archive_format == ARCHIVE_FORMAT_AR_BSD) + a->archive.archive_format_name = "ar (BSD)"; + else + a->archive.archive_format_name = "ar"; + + /* + * Remove trailing spaces from the filename. GNU and BSD + * variants both pad filename area out with spaces. + * This will only be wrong if GNU/SVR4 'ar' implementations + * omit trailing '/' for 16-char filenames and we have + * a 16-char filename that ends in ' '. + */ + p = filename + AR_name_size - 1; + while (p >= filename && *p == ' ') { + *p = '\0'; + p--; + } + + /* + * Remove trailing slash unless first character is '/'. + * (BSD entries never end in '/', so this will only trim + * GNU-format entries. GNU special entries start with '/' + * and are not terminated in '/', so we don't trim anything + * that starts with '/'.) + */ + if (filename[0] != '/' && *p == '/') + *p = '\0'; + + /* + * '//' is the GNU filename table. + * Later entries can refer to names in this table. 
+ */ + if (strcmp(filename, "//") == 0) { + /* This must come before any call to _read_ahead. */ + ar_parse_common_header(ar, entry, h); + archive_entry_copy_pathname(entry, filename); + archive_entry_set_filetype(entry, AE_IFREG); + /* Get the size of the filename table. */ + number = ar_atol10(h + AR_size_offset, AR_size_size); +#ifndef __minix + if (number > SIZE_MAX) { + archive_set_error(&a->archive, ARCHIVE_ERRNO_MISC, + "Filename table too large"); + return (ARCHIVE_FATAL); + } +#else + /* The above won't work for us as UINT32_MAX == SIZE_MAX on Minix. + * We simply decrease the maximum allowed filename table size + * to SIZE_MAX - 1. + */ + if (number == SIZE_MAX) { + archive_set_error(&a->archive, ARCHIVE_ERRNO_MISC, + "Filename table too large"); + return (ARCHIVE_FATAL); + } +#endif + entry_size = (size_t)number; + if (entry_size == 0) { + archive_set_error(&a->archive, EINVAL, + "Invalid string table"); + return (ARCHIVE_WARN); + } + if (ar->strtab != NULL) { + archive_set_error(&a->archive, EINVAL, + "More than one string tables exist"); + return (ARCHIVE_WARN); + } + + /* Read the filename table into memory. */ + st = malloc(entry_size); + if (st == NULL) { + archive_set_error(&a->archive, ENOMEM, + "Can't allocate filename table buffer"); + return (ARCHIVE_FATAL); + } + ar->strtab = st; + ar->strtab_size = entry_size; + if ((b = __archive_read_ahead(a, entry_size, NULL)) == NULL) + return (ARCHIVE_FATAL); + memcpy(st, b, entry_size); + __archive_read_consume(a, entry_size); + /* All contents are consumed. */ + ar->entry_bytes_remaining = 0; + archive_entry_set_size(entry, ar->entry_bytes_remaining); + + /* Parse the filename table. */ + return (ar_parse_gnu_filename_table(a)); + } + + /* + * GNU variant handles long filenames by storing /<number> + * to indicate a name stored in the filename table. + * XXX TODO: Verify that it's all digits...
Don't be fooled + * by "/9xyz" XXX + */ + if (filename[0] == '/' && filename[1] >= '0' && filename[1] <= '9') { + number = ar_atol10(h + AR_name_offset + 1, AR_name_size - 1); + /* + * If we can't look up the real name, warn and return + * the entry with the wrong name. + */ + if (ar->strtab == NULL || number > ar->strtab_size) { + archive_set_error(&a->archive, EINVAL, + "Can't find long filename for entry"); + archive_entry_copy_pathname(entry, filename); + /* Parse the time, owner, mode, size fields. */ + ar_parse_common_header(ar, entry, h); + return (ARCHIVE_WARN); + } + + archive_entry_copy_pathname(entry, &ar->strtab[(size_t)number]); + /* Parse the time, owner, mode, size fields. */ + return (ar_parse_common_header(ar, entry, h)); + } + + /* + * BSD handles long filenames by storing "#1/" followed by the + * length of filename as a decimal number, then prepends the + * filename to the file contents. + */ + if (strncmp(filename, "#1/", 3) == 0) { + /* Parse the time, owner, mode, size fields. */ + /* This must occur before _read_ahead is called again. */ + ar_parse_common_header(ar, entry, h); + + /* Parse the size of the name, adjust the file size. */ + number = ar_atol10(h + AR_name_offset + 3, AR_name_size - 3); + bsd_name_length = (size_t)number; + /* Guard against the filename + trailing NUL + * overflowing a size_t and against the filename size + * being larger than the entire entry. */ +#ifndef __minix + if (number > (uint64_t)(bsd_name_length + 1) + || (off_t)bsd_name_length > ar->entry_bytes_remaining) { + archive_set_error(&a->archive, ARCHIVE_ERRNO_MISC, + "Bad input file size"); + return (ARCHIVE_FATAL); + } +#else + /* The above way won't work for us as we use uint32_t for number + * and not uint64_t. We decrease the maximum allowed name + * length to UINT32_MAX - 1 (which is what ar_atol10 will return + * in case of an overflow).
+ */ + if (number == UINT32_MAX + || (off_t)bsd_name_length > ar->entry_bytes_remaining) { + archive_set_error(&a->archive, ARCHIVE_ERRNO_MISC, + "Bad input file size"); + return (ARCHIVE_FATAL); + } +#endif + ar->entry_bytes_remaining -= bsd_name_length; + /* Adjust file size reported to client. */ + archive_entry_set_size(entry, ar->entry_bytes_remaining); + + /* Read the long name into memory. */ + if ((b = __archive_read_ahead(a, bsd_name_length, NULL)) == NULL) { + archive_set_error(&a->archive, ARCHIVE_ERRNO_MISC, + "Truncated input file"); + return (ARCHIVE_FATAL); + } + __archive_read_consume(a, bsd_name_length); + + /* Store it in the entry. */ + p = (char *)malloc(bsd_name_length + 1); + if (p == NULL) { + archive_set_error(&a->archive, ENOMEM, + "Can't allocate fname buffer"); + return (ARCHIVE_FATAL); + } + strncpy(p, b, bsd_name_length); + p[bsd_name_length] = '\0'; + archive_entry_copy_pathname(entry, p); + free(p); + return (ARCHIVE_OK); + } + + /* + * "/" is the SVR4/GNU archive symbol table. + */ + if (strcmp(filename, "/") == 0) { + archive_entry_copy_pathname(entry, "/"); + /* Parse the time, owner, mode, size fields. */ + r = ar_parse_common_header(ar, entry, h); + /* Force the file type to a regular file. */ + archive_entry_set_filetype(entry, AE_IFREG); + return (r); + } + + /* + * "__.SYMDEF" is a BSD archive symbol table. + */ + if (strcmp(filename, "__.SYMDEF") == 0) { + archive_entry_copy_pathname(entry, filename); + /* Parse the time, owner, mode, size fields. */ + return (ar_parse_common_header(ar, entry, h)); + } + + /* + * Otherwise, this is a standard entry. The filename + * has already been trimmed as much as possible, based + * on our current knowledge of the format. 
+ */ + archive_entry_copy_pathname(entry, filename); + return (ar_parse_common_header(ar, entry, h)); +} + +static int +ar_parse_common_header(struct ar *ar, struct archive_entry *entry, + const char *h) +{ +#ifndef __minix + uint64_t n; +#else + uint32_t n; +#endif + + /* Copy remaining header */ + archive_entry_set_mtime(entry, + (time_t)ar_atol10(h + AR_date_offset, AR_date_size), 0L); + archive_entry_set_uid(entry, + (uid_t)ar_atol10(h + AR_uid_offset, AR_uid_size)); + archive_entry_set_gid(entry, + (gid_t)ar_atol10(h + AR_gid_offset, AR_gid_size)); + archive_entry_set_mode(entry, + (mode_t)ar_atol8(h + AR_mode_offset, AR_mode_size)); + n = ar_atol10(h + AR_size_offset, AR_size_size); + + ar->entry_offset = 0; + ar->entry_padding = n % 2; + archive_entry_set_size(entry, n); + ar->entry_bytes_remaining = n; + return (ARCHIVE_OK); +} + +static int +archive_read_format_ar_read_data(struct archive_read *a, + const void **buff, size_t *size, off_t *offset) +{ + ssize_t bytes_read; + struct ar *ar; + + ar = (struct ar *)(a->format->data); + + if (ar->entry_bytes_remaining > 0) { + *buff = __archive_read_ahead(a, 1, &bytes_read); + if (bytes_read == 0) { + archive_set_error(&a->archive, ARCHIVE_ERRNO_MISC, + "Truncated ar archive"); + return (ARCHIVE_FATAL); + } + if (bytes_read < 0) + return (ARCHIVE_FATAL); + if (bytes_read > ar->entry_bytes_remaining) + bytes_read = (ssize_t)ar->entry_bytes_remaining; + *size = bytes_read; + *offset = ar->entry_offset; + ar->entry_offset += bytes_read; + ar->entry_bytes_remaining -= bytes_read; + __archive_read_consume(a, (size_t)bytes_read); + return (ARCHIVE_OK); + } else { + while (ar->entry_padding > 0) { + *buff = __archive_read_ahead(a, 1, &bytes_read); + if (bytes_read <= 0) + return (ARCHIVE_FATAL); + if (bytes_read > ar->entry_padding) + bytes_read = (ssize_t)ar->entry_padding; + __archive_read_consume(a, (size_t)bytes_read); + ar->entry_padding -= bytes_read; + } + *buff = NULL; + *size = 0; + *offset = ar->entry_offset; 
+ return (ARCHIVE_EOF); + } +} + +static int +archive_read_format_ar_skip(struct archive_read *a) +{ + off_t bytes_skipped; + struct ar* ar; + + ar = (struct ar *)(a->format->data); + + bytes_skipped = __archive_read_skip(a, + ar->entry_bytes_remaining + ar->entry_padding); + if (bytes_skipped < 0) + return (ARCHIVE_FATAL); + + ar->entry_bytes_remaining = 0; + ar->entry_padding = 0; + + return (ARCHIVE_OK); +} + +static int +ar_parse_gnu_filename_table(struct archive_read *a) +{ + struct ar *ar; + char *p; + size_t size; + + ar = (struct ar*)(a->format->data); + size = ar->strtab_size; + + for (p = ar->strtab; p < ar->strtab + size - 1; ++p) { + if (*p == '/') { + *p++ = '\0'; + if (*p != '\n') + goto bad_string_table; + *p = '\0'; + } + } + /* + * GNU ar always pads the table to an even size. + * The pad character is either '\n' or '`'. + */ + if (p != ar->strtab + size && *p != '\n' && *p != '`') + goto bad_string_table; + + /* Enforce zero termination. */ + ar->strtab[size - 1] = '\0'; + + return (ARCHIVE_OK); + +bad_string_table: + archive_set_error(&a->archive, EINVAL, + "Invalid string table"); + free(ar->strtab); + ar->strtab = NULL; + return (ARCHIVE_WARN); +} + +#ifndef __minix +static uint64_t +ar_atol8(const char *p, unsigned char_cnt) +{ + uint64_t l, limit, last_digit_limit; + unsigned int digit, base; + + base = 8; + limit = UINT64_MAX / base; + last_digit_limit = UINT64_MAX % base; + + while ((*p == ' ' || *p == '\t') && char_cnt-- > 0) + p++; + + l = 0; + digit = *p - '0'; + while (*p >= '0' && digit < base && char_cnt-- > 0) { + if (l>limit || (l == limit && digit > last_digit_limit)) { + l = UINT64_MAX; /* Truncate on overflow. 
*/ + break; + } + l = (l * base) + digit; + digit = *++p - '0'; + } + return (l); +} +#else +static uint32_t +ar_atol8(const char *p, unsigned char_cnt) +{ + uint32_t l, limit, last_digit_limit; + unsigned int digit, base; + + base = 8; + limit = UINT32_MAX / base; + last_digit_limit = UINT32_MAX % base; + + while ((*p == ' ' || *p == '\t') && char_cnt-- > 0) + p++; + + l = 0; + digit = *p - '0'; + while (*p >= '0' && digit < base && char_cnt-- > 0) { + if (l>limit || (l == limit && digit > last_digit_limit)) { + l = UINT32_MAX; /* Truncate on overflow. */ + break; + } + l = (l * base) + digit; + digit = *++p - '0'; + } + return (l); +} +#endif + +#ifndef __minix +static uint64_t +ar_atol10(const char *p, unsigned char_cnt) +{ + uint64_t l, limit, last_digit_limit; + unsigned int base, digit; + + base = 10; + limit = UINT64_MAX / base; + last_digit_limit = UINT64_MAX % base; + + while ((*p == ' ' || *p == '\t') && char_cnt-- > 0) + p++; + l = 0; + digit = *p - '0'; + while (*p >= '0' && digit < base && char_cnt-- > 0) { + if (l > limit || (l == limit && digit > last_digit_limit)) { + l = UINT64_MAX; /* Truncate on overflow. */ + break; + } + l = (l * base) + digit; + digit = *++p - '0'; + } + return (l); +} +#else +static uint32_t +ar_atol10(const char *p, unsigned char_cnt) +{ + uint32_t l, limit, last_digit_limit; + unsigned int base, digit; + + base = 10; + limit = UINT32_MAX / base; + last_digit_limit = UINT32_MAX % base; + + while ((*p == ' ' || *p == '\t') && char_cnt-- > 0) + p++; + l = 0; + digit = *p - '0'; + while (*p >= '0' && digit < base && char_cnt-- > 0) { + if (l > limit || (l == limit && digit > last_digit_limit)) { + l = UINT32_MAX; /* Truncate on overflow. 
*/ + break; + } + l = (l * base) + digit; + digit = *++p - '0'; + } + return (l); +} +#endif diff --git a/lib/libarchive/archive_read_support_format_empty.c b/lib/libarchive/archive_read_support_format_empty.c new file mode 100644 index 000000000..518fdcb49 --- /dev/null +++ b/lib/libarchive/archive_read_support_format_empty.c @@ -0,0 +1,93 @@ +/*- + * Copyright (c) 2003-2007 Tim Kientzle + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR(S) ``AS IS'' AND ANY EXPRESS OR + * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES + * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. + * IN NO EVENT SHALL THE AUTHOR(S) BE LIABLE FOR ANY DIRECT, INDIRECT, + * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT + * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF + * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 
+ */ + +#include "archive_platform.h" +__FBSDID("$FreeBSD: head/lib/libarchive/archive_read_support_format_empty.c 191524 2009-04-26 18:24:14Z kientzle $"); + +#include "archive.h" +#include "archive_entry.h" +#include "archive_private.h" +#include "archive_read_private.h" + +static int archive_read_format_empty_bid(struct archive_read *); +static int archive_read_format_empty_read_data(struct archive_read *, + const void **, size_t *, off_t *); +static int archive_read_format_empty_read_header(struct archive_read *, + struct archive_entry *); +int +archive_read_support_format_empty(struct archive *_a) +{ + struct archive_read *a = (struct archive_read *)_a; + int r; + + r = __archive_read_register_format(a, + NULL, + NULL, + archive_read_format_empty_bid, + NULL, + archive_read_format_empty_read_header, + archive_read_format_empty_read_data, + NULL, + NULL); + + return (r); +} + + +static int +archive_read_format_empty_bid(struct archive_read *a) +{ + ssize_t avail; + + (void)__archive_read_ahead(a, 1, &avail); + if (avail != 0) + return (-1); + return (1); +} + +static int +archive_read_format_empty_read_header(struct archive_read *a, + struct archive_entry *entry) +{ + (void)a; /* UNUSED */ + (void)entry; /* UNUSED */ + + a->archive.archive_format = ARCHIVE_FORMAT_EMPTY; + a->archive.archive_format_name = "Empty file"; + + return (ARCHIVE_EOF); +} + +static int +archive_read_format_empty_read_data(struct archive_read *a, + const void **buff, size_t *size, off_t *offset) +{ + (void)a; /* UNUSED */ + (void)buff; /* UNUSED */ + (void)size; /* UNUSED */ + (void)offset; /* UNUSED */ + + return (ARCHIVE_EOF); +} diff --git a/lib/libarchive/archive_read_support_format_iso9660.c b/lib/libarchive/archive_read_support_format_iso9660.c new file mode 100644 index 000000000..0c640c88e --- /dev/null +++ b/lib/libarchive/archive_read_support_format_iso9660.c @@ -0,0 +1,2830 @@ +/*- + * Copyright (c) 2003-2007 Tim Kientzle + * Copyright (c) 2009 Andreas Henriksson + * Copyright 
(c) 2009 Michihiro NAKAJIMA + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR(S) ``AS IS'' AND ANY EXPRESS OR + * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES + * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. + * IN NO EVENT SHALL THE AUTHOR(S) BE LIABLE FOR ANY DIRECT, INDIRECT, + * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT + * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF + * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 
+ */ + +#include "archive_platform.h" +__FBSDID("$FreeBSD: head/lib/libarchive/archive_read_support_format_iso9660.c 201246 2009-12-30 05:30:35Z kientzle $"); + +#ifdef HAVE_ERRNO_H +#include <errno.h> +#endif +/* #include <stdint.h> */ /* See archive_platform.h */ +#include <stdio.h> +#ifdef HAVE_STDLIB_H +#include <stdlib.h> +#endif +#ifdef HAVE_STRING_H +#include <string.h> +#endif +#include <time.h> +#ifdef HAVE_ZLIB_H +#include <zlib.h> +#endif + +#include "archive.h" +#include "archive_endian.h" +#include "archive_entry.h" +#include "archive_private.h" +#include "archive_read_private.h" +#include "archive_string.h" + +/* + * An overview of ISO 9660 format: + * + * Each disk is laid out as follows: + * * 32k reserved for private use + * * Volume descriptor table. Each volume descriptor + * is 2k and specifies basic format information. + * The "Primary Volume Descriptor" (PVD) is defined by the + * standard and should always be present; other volume + * descriptors include various vendor-specific extensions. + * * Files and directories. Each file/dir is specified by + * an "extent" (starting sector and length in bytes). + * Dirs are just files with directory records packed one + * after another. The PVD contains a single dir entry + * specifying the location of the root directory. Everything + * else follows from there. + * + * This module works by first reading the volume descriptors, then + * building a list of directory entries, sorted by starting + * sector. At each step, I look for the earliest dir entry that + * hasn't yet been read, seek forward to that location and read + * that entry. If it's a dir, I slurp in the new dir entries and + * add them to the heap; if it's a regular file, I return the + * corresponding archive_entry and wait for the client to request + * the file body. This strategy allows us to read most compliant + * CDs with a single pass through the data, as required by libarchive. + */ +#define LOGICAL_BLOCK_SIZE 2048 +#define SYSTEM_AREA_BLOCK 16 + +/* Structure of on-disk primary volume descriptor.
*/ +#define PVD_type_offset 0 +#define PVD_type_size 1 +#define PVD_id_offset (PVD_type_offset + PVD_type_size) +#define PVD_id_size 5 +#define PVD_version_offset (PVD_id_offset + PVD_id_size) +#define PVD_version_size 1 +#define PVD_reserved1_offset (PVD_version_offset + PVD_version_size) +#define PVD_reserved1_size 1 +#define PVD_system_id_offset (PVD_reserved1_offset + PVD_reserved1_size) +#define PVD_system_id_size 32 +#define PVD_volume_id_offset (PVD_system_id_offset + PVD_system_id_size) +#define PVD_volume_id_size 32 +#define PVD_reserved2_offset (PVD_volume_id_offset + PVD_volume_id_size) +#define PVD_reserved2_size 8 +#define PVD_volume_space_size_offset (PVD_reserved2_offset + PVD_reserved2_size) +#define PVD_volume_space_size_size 8 +#define PVD_reserved3_offset (PVD_volume_space_size_offset + PVD_volume_space_size_size) +#define PVD_reserved3_size 32 +#define PVD_volume_set_size_offset (PVD_reserved3_offset + PVD_reserved3_size) +#define PVD_volume_set_size_size 4 +#define PVD_volume_sequence_number_offset (PVD_volume_set_size_offset + PVD_volume_set_size_size) +#define PVD_volume_sequence_number_size 4 +#define PVD_logical_block_size_offset (PVD_volume_sequence_number_offset + PVD_volume_sequence_number_size) +#define PVD_logical_block_size_size 4 +#define PVD_path_table_size_offset (PVD_logical_block_size_offset + PVD_logical_block_size_size) +#define PVD_path_table_size_size 8 +#define PVD_type_1_path_table_offset (PVD_path_table_size_offset + PVD_path_table_size_size) +#define PVD_type_1_path_table_size 4 +#define PVD_opt_type_1_path_table_offset (PVD_type_1_path_table_offset + PVD_type_1_path_table_size) +#define PVD_opt_type_1_path_table_size 4 +#define PVD_type_m_path_table_offset (PVD_opt_type_1_path_table_offset + PVD_opt_type_1_path_table_size) +#define PVD_type_m_path_table_size 4 +#define PVD_opt_type_m_path_table_offset (PVD_type_m_path_table_offset + PVD_type_m_path_table_size) +#define PVD_opt_type_m_path_table_size 4 +#define 
PVD_root_directory_record_offset (PVD_opt_type_m_path_table_offset + PVD_opt_type_m_path_table_size) +#define PVD_root_directory_record_size 34 +#define PVD_volume_set_id_offset (PVD_root_directory_record_offset + PVD_root_directory_record_size) +#define PVD_volume_set_id_size 128 +#define PVD_publisher_id_offset (PVD_volume_set_id_offset + PVD_volume_set_id_size) +#define PVD_publisher_id_size 128 +#define PVD_preparer_id_offset (PVD_publisher_id_offset + PVD_publisher_id_size) +#define PVD_preparer_id_size 128 +#define PVD_application_id_offset (PVD_preparer_id_offset + PVD_preparer_id_size) +#define PVD_application_id_size 128 +#define PVD_copyright_file_id_offset (PVD_application_id_offset + PVD_application_id_size) +#define PVD_copyright_file_id_size 37 +#define PVD_abstract_file_id_offset (PVD_copyright_file_id_offset + PVD_copyright_file_id_size) +#define PVD_abstract_file_id_size 37 +#define PVD_bibliographic_file_id_offset (PVD_abstract_file_id_offset + PVD_abstract_file_id_size) +#define PVD_bibliographic_file_id_size 37 +#define PVD_creation_date_offset (PVD_bibliographic_file_id_offset + PVD_bibliographic_file_id_size) +#define PVD_creation_date_size 17 +#define PVD_modification_date_offset (PVD_creation_date_offset + PVD_creation_date_size) +#define PVD_modification_date_size 17 +#define PVD_expiration_date_offset (PVD_modification_date_offset + PVD_modification_date_size) +#define PVD_expiration_date_size 17 +#define PVD_effective_date_offset (PVD_expiration_date_offset + PVD_expiration_date_size) +#define PVD_effective_date_size 17 +#define PVD_file_structure_version_offset (PVD_effective_date_offset + PVD_effective_date_size) +#define PVD_file_structure_version_size 1 +#define PVD_reserved4_offset (PVD_file_structure_version_offset + PVD_file_structure_version_size) +#define PVD_reserved4_size 1 +#define PVD_application_data_offset (PVD_reserved4_offset + PVD_reserved4_size) +#define PVD_application_data_size 512 +#define PVD_reserved5_offset 
(PVD_application_data_offset + PVD_application_data_size) +#define PVD_reserved5_size (2048 - PVD_reserved5_offset) + +/* TODO: It would make future maintenance easier to just hardcode the + * above values. In particular, ECMA119 states the offsets as part of + * the standard. That would eliminate the need for the following check.*/ +#if PVD_reserved5_offset != 1395 +#error PVD offset and size definitions are wrong. +#endif + + +/* Structure of optional on-disk supplementary volume descriptor. */ +#define SVD_type_offset 0 +#define SVD_type_size 1 +#define SVD_id_offset (SVD_type_offset + SVD_type_size) +#define SVD_id_size 5 +#define SVD_version_offset (SVD_id_offset + SVD_id_size) +#define SVD_version_size 1 +/* ... */ +#define SVD_reserved1_offset 72 +#define SVD_reserved1_size 8 +#define SVD_volume_space_size_offset 80 +#define SVD_volume_space_size_size 8 +#define SVD_escape_sequences_offset (SVD_volume_space_size_offset + SVD_volume_space_size_size) +#define SVD_escape_sequences_size 32 +/* ... */ +#define SVD_logical_block_size_offset 128 +#define SVD_logical_block_size_size 4 +#define SVD_type_L_path_table_offset 140 +#define SVD_type_M_path_table_offset 148 +/* ... */ +#define SVD_root_directory_record_offset 156 +#define SVD_root_directory_record_size 34 +#define SVD_file_structure_version_offset 881 +#define SVD_reserved2_offset 882 +#define SVD_reserved2_size 1 +#define SVD_reserved3_offset 1395 +#define SVD_reserved3_size 653 +/* ... */ +/* FIXME: validate correctness of last SVD entry offset. */ + +/* Structure of an on-disk directory record. */ +/* Note: ISO9660 stores each multi-byte integer twice, once in + * each byte order. The sizes here are the size of just one + * of the two integers. (This is why the offset of a field isn't + * the same as the offset+size of the previous field.) 
*/ +#define DR_length_offset 0 +#define DR_length_size 1 +#define DR_ext_attr_length_offset 1 +#define DR_ext_attr_length_size 1 +#define DR_extent_offset 2 +#define DR_extent_size 4 +#define DR_size_offset 10 +#define DR_size_size 4 +#define DR_date_offset 18 +#define DR_date_size 7 +#define DR_flags_offset 25 +#define DR_flags_size 1 +#define DR_file_unit_size_offset 26 +#define DR_file_unit_size_size 1 +#define DR_interleave_offset 27 +#define DR_interleave_size 1 +#define DR_volume_sequence_number_offset 28 +#define DR_volume_sequence_number_size 2 +#define DR_name_len_offset 32 +#define DR_name_len_size 1 +#define DR_name_offset 33 + +#ifdef HAVE_ZLIB_H +static const unsigned char zisofs_magic[8] = { + 0x37, 0xE4, 0x53, 0x96, 0xC9, 0xDB, 0xD6, 0x07 +}; + +struct zisofs { + /* Set 1 if this file compressed by paged zlib */ + int pz; + int pz_log2_bs; /* Log2 of block size */ + uint64_t pz_uncompressed_size; + + int initialized; + unsigned char *uncompressed_buffer; + size_t uncompressed_buffer_size; + + uint32_t pz_offset; + unsigned char header[16]; + size_t header_avail; + int header_passed; + unsigned char *block_pointers; + size_t block_pointers_alloc; + size_t block_pointers_size; + size_t block_pointers_avail; + size_t block_off; + uint32_t block_avail; + + z_stream stream; + int stream_valid; +}; +#else +struct zisofs { + /* Set 1 if this file compressed by paged zlib */ + int pz; +}; +#endif + +struct content { + uint64_t offset;/* Offset on disk. */ + uint64_t size; /* File size in bytes. */ + struct content *next; +}; + +/* In-memory storage for a directory record. */ +struct file_info { + struct file_info *use_next; + struct file_info *parent; + struct file_info *next; + int subdirs; + uint64_t key; /* Heap Key. */ + uint64_t offset; /* Offset on disk. */ + uint64_t size; /* File size in bytes. */ + uint32_t ce_offset; /* Offset of CE. */ + uint32_t ce_size; /* Size of CE. */ + char re; /* Having RRIP "RE" extension. 
*/ + uint64_t cl_offset; /* Having RRIP "CL" extension. */ + int birthtime_is_set; + time_t birthtime; /* File created time. */ + time_t mtime; /* File last modified time. */ + time_t atime; /* File last accessed time. */ + time_t ctime; /* File attribute change time. */ + uint64_t rdev; /* Device number. */ + mode_t mode; + uid_t uid; + gid_t gid; + int64_t number; + int nlinks; + struct archive_string name; /* Pathname */ + char name_continues; /* Non-zero if name continues */ + struct archive_string symlink; + char symlink_continues; /* Non-zero if link continues */ + /* Set 1 if this file compressed by paged zlib(zisofs) */ + int pz; + int pz_log2_bs; /* Log2 of block size */ + uint64_t pz_uncompressed_size; + /* Set 1 if this file is multi extent. */ + int multi_extent; + struct { + struct content *first; + struct content **last; + } contents; + char exposed; +}; + +struct heap_queue { + struct file_info **files; + int allocated; + int used; +}; + +struct iso9660 { + int magic; +#define ISO9660_MAGIC 0x96609660 + + int opt_support_joliet; + int opt_support_rockridge; + + struct archive_string pathname; + char seenRockridge; /* Set true if RR extensions are used. */ + char seenSUSP; /* Set true if SUSP is being used. */ + char seenJoliet; + + unsigned char suspOffset; + struct file_info *rr_moved; + struct heap_queue re_dirs; + struct heap_queue cl_files; + struct read_ce_queue { + struct read_ce_req { + uint64_t offset;/* Offset of CE on disk. */ + struct file_info *file; + } *reqs; + int cnt; + int allocated; + } read_ce_req; + + int64_t previous_number; + struct archive_string previous_pathname; + + struct file_info *use_files; + struct heap_queue pending_files; + struct { + struct file_info *first; + struct file_info **last; + } cache_files; + + uint64_t current_position; + ssize_t logical_block_size; + uint64_t volume_size; /* Total size of volume in bytes. */ + int32_t volume_block;/* Total size of volume in logical blocks.
*/ + + struct vd { + int location; /* Location of Extent. */ + uint32_t size; + } primary, joliet; + + off_t entry_sparse_offset; + int64_t entry_bytes_remaining; + struct zisofs entry_zisofs; + struct content *entry_content; +}; + +static int archive_read_format_iso9660_bid(struct archive_read *); +static int archive_read_format_iso9660_options(struct archive_read *, + const char *, const char *); +static int archive_read_format_iso9660_cleanup(struct archive_read *); +static int archive_read_format_iso9660_read_data(struct archive_read *, + const void **, size_t *, off_t *); +static int archive_read_format_iso9660_read_data_skip(struct archive_read *); +static int archive_read_format_iso9660_read_header(struct archive_read *, + struct archive_entry *); +static const char *build_pathname(struct archive_string *, struct file_info *); +#if DEBUG +static void dump_isodirrec(FILE *, const unsigned char *isodirrec); +#endif +static time_t time_from_tm(struct tm *); +static time_t isodate17(const unsigned char *); +static time_t isodate7(const unsigned char *); +static int isBootRecord(struct iso9660 *, const unsigned char *); +static int isVolumePartition(struct iso9660 *, const unsigned char *); +static int isVDSetTerminator(struct iso9660 *, const unsigned char *); +static int isJolietSVD(struct iso9660 *, const unsigned char *); +static int isSVD(struct iso9660 *, const unsigned char *); +static int isEVD(struct iso9660 *, const unsigned char *); +static int isPVD(struct iso9660 *, const unsigned char *); +static struct file_info *next_cache_entry(struct iso9660 *iso9660); +static int next_entry_seek(struct archive_read *a, struct iso9660 *iso9660, + struct file_info **pfile); +static struct file_info * + parse_file_info(struct archive_read *a, + struct file_info *parent, const unsigned char *isodirrec); +static int parse_rockridge(struct archive_read *a, + struct file_info *file, const unsigned char *start, + const unsigned char *end); +static int 
register_CE(struct archive_read *a, int32_t location, + struct file_info *file); +static int read_CE(struct archive_read *a, struct iso9660 *iso9660); +static void parse_rockridge_NM1(struct file_info *, + const unsigned char *, int); +static void parse_rockridge_SL1(struct file_info *, + const unsigned char *, int); +static void parse_rockridge_TF1(struct file_info *, + const unsigned char *, int); +static void parse_rockridge_ZF1(struct file_info *, + const unsigned char *, int); +static void register_file(struct iso9660 *, struct file_info *); +static void release_files(struct iso9660 *); +static unsigned toi(const void *p, int n); +static inline void cache_add_entry(struct iso9660 *iso9660, + struct file_info *file); +static inline void cache_add_to_next_of_parent(struct iso9660 *iso9660, + struct file_info *file); +static inline struct file_info *cache_get_entry(struct iso9660 *iso9660); +static void heap_add_entry(struct heap_queue *heap, + struct file_info *file, uint64_t key); +static struct file_info *heap_get_entry(struct heap_queue *heap); + +#define add_entry(iso9660, file) \ + heap_add_entry(&((iso9660)->pending_files), file, file->offset) +#define next_entry(iso9660) \ + heap_get_entry(&((iso9660)->pending_files)) + +int +archive_read_support_format_iso9660(struct archive *_a) +{ + struct archive_read *a = (struct archive_read *)_a; + struct iso9660 *iso9660; + int r; + + iso9660 = (struct iso9660 *)malloc(sizeof(*iso9660)); + if (iso9660 == NULL) { + archive_set_error(&a->archive, ENOMEM, "Can't allocate iso9660 data"); + return (ARCHIVE_FATAL); + } + memset(iso9660, 0, sizeof(*iso9660)); + iso9660->magic = ISO9660_MAGIC; + iso9660->cache_files.first = NULL; + iso9660->cache_files.last = &(iso9660->cache_files.first); + /* Enable to support Joliet extensions by default. */ + iso9660->opt_support_joliet = 1; + /* Enable to support Rock Ridge extensions by default. 
*/ + iso9660->opt_support_rockridge = 1; + + r = __archive_read_register_format(a, + iso9660, + "iso9660", + archive_read_format_iso9660_bid, + archive_read_format_iso9660_options, + archive_read_format_iso9660_read_header, + archive_read_format_iso9660_read_data, + archive_read_format_iso9660_read_data_skip, + archive_read_format_iso9660_cleanup); + + if (r != ARCHIVE_OK) { + free(iso9660); + return (r); + } + return (ARCHIVE_OK); +} + + +static int +archive_read_format_iso9660_bid(struct archive_read *a) +{ + struct iso9660 *iso9660; + ssize_t bytes_read; + const void *h; + const unsigned char *p; + int seenTerminator; + + iso9660 = (struct iso9660 *)(a->format->data); + + /* + * Skip the first 32k (reserved area) and get the first + * 8 sectors of the volume descriptor table. Of course, + * if the I/O layer gives us more, we'll take it. + */ +#define RESERVED_AREA (SYSTEM_AREA_BLOCK * LOGICAL_BLOCK_SIZE) + h = __archive_read_ahead(a, + RESERVED_AREA + 8 * LOGICAL_BLOCK_SIZE, + &bytes_read); + if (h == NULL) + return (-1); + p = (const unsigned char *)h; + + /* Skip the reserved area. */ + bytes_read -= RESERVED_AREA; + p += RESERVED_AREA; + + /* Check each volume descriptor. */ + seenTerminator = 0; + for (; bytes_read > LOGICAL_BLOCK_SIZE; + bytes_read -= LOGICAL_BLOCK_SIZE, p += LOGICAL_BLOCK_SIZE) { + /* Do not handle undefined Volume Descriptor Type. 
*/ + if (p[0] >= 4 && p[0] <= 254) + return (0); + /* Standard Identifier must be "CD001" */ + if (memcmp(p + 1, "CD001", 5) != 0) + return (0); + if (!iso9660->primary.location) { + if (isPVD(iso9660, p)) + continue; + } + if (!iso9660->joliet.location) { + if (isJolietSVD(iso9660, p)) + continue; + } + if (isBootRecord(iso9660, p)) + continue; + if (isEVD(iso9660, p)) + continue; + if (isSVD(iso9660, p)) + continue; + if (isVolumePartition(iso9660, p)) + continue; + if (isVDSetTerminator(iso9660, p)) { + seenTerminator = 1; + break; + } + return (0); + } + /* + * ISO 9660 format must have Primary Volume Descriptor and + * Volume Descriptor Set Terminator. + */ + if (seenTerminator && iso9660->primary.location > 16) + return (48); + + /* We didn't find a valid PVD; return a bid of zero. */ + return (0); +} + +static int +archive_read_format_iso9660_options(struct archive_read *a, + const char *key, const char *val) +{ + struct iso9660 *iso9660; + + iso9660 = (struct iso9660 *)(a->format->data); + + if (strcmp(key, "joliet") == 0) { + if (val == NULL || strcmp(val, "off") == 0 || + strcmp(val, "ignore") == 0 || + strcmp(val, "disable") == 0 || + strcmp(val, "0") == 0) + iso9660->opt_support_joliet = 0; + else + iso9660->opt_support_joliet = 1; + return (ARCHIVE_OK); + } + if (strcmp(key, "rockridge") == 0 || + strcmp(key, "Rockridge") == 0) { + iso9660->opt_support_rockridge = val != NULL; + return (ARCHIVE_OK); + } + + /* Note: The "warn" return is just to inform the options + * supervisor that we didn't handle it. It will generate + * a suitable error if no one used this option. */ + return (ARCHIVE_WARN); +} + +static int +isBootRecord(struct iso9660 *iso9660, const unsigned char *h) +{ + (void)iso9660; /* UNUSED */ + + /* Type of the Volume Descriptor Boot Record must be 0. */ + if (h[0] != 0) + return (0); + + /* Volume Descriptor Version must be 1.
*/ + if (h[6] != 1) + return (0); + + return (1); +} + +static int +isVolumePartition(struct iso9660 *iso9660, const unsigned char *h) +{ + int32_t location; + + /* Type of the Volume Partition Descriptor must be 3. */ + if (h[0] != 3) + return (0); + + /* Volume Descriptor Version must be 1. */ + if (h[6] != 1) + return (0); + /* Unused Field */ + if (h[7] != 0) + return (0); + + location = archive_le32dec(h + 72); + if (location <= SYSTEM_AREA_BLOCK || + location >= iso9660->volume_block) + return (0); + if ((uint32_t)location != archive_be32dec(h + 76)) + return (0); + + return (1); +} + +static int +isVDSetTerminator(struct iso9660 *iso9660, const unsigned char *h) +{ + int i; + + (void)iso9660; /* UNUSED */ + + /* Type of the Volume Descriptor Set Terminator must be 255. */ + if (h[0] != 255) + return (0); + + /* Volume Descriptor Version must be 1. */ + if (h[6] != 1) + return (0); + + /* Reserved field must be 0. */ + for (i = 7; i < 2048; ++i) + if (h[i] != 0) + return (0); + + return (1); +} + +static int +isJolietSVD(struct iso9660 *iso9660, const unsigned char *h) +{ + const unsigned char *p; + ssize_t logical_block_size; + int32_t volume_block; + + /* Check if current sector is a kind of Supplementary Volume + * Descriptor. */ + if (!isSVD(iso9660, h)) + return (0); + + /* FIXME: do more validations according to joliet spec. */ + + /* check if this SVD contains joliet extension! */ + p = h + SVD_escape_sequences_offset; + /* N.B. Joliet spec says p[1] == '\\', but.... 
*/ + if (p[0] == '%' && p[1] == '/') { + int level = 0; + + if (p[2] == '@') + level = 1; + else if (p[2] == 'C') + level = 2; + else if (p[2] == 'E') + level = 3; + else /* not joliet */ + return (0); + + iso9660->seenJoliet = level; + + } else /* not joliet */ + return (0); + + logical_block_size = + archive_le16dec(h + SVD_logical_block_size_offset); + volume_block = archive_le32dec(h + SVD_volume_space_size_offset); + + iso9660->logical_block_size = logical_block_size; + iso9660->volume_block = volume_block; + iso9660->volume_size = logical_block_size * (uint64_t)volume_block; + /* Read Root Directory Record in Volume Descriptor. */ + p = h + SVD_root_directory_record_offset; + iso9660->joliet.location = archive_le32dec(p + DR_extent_offset); + iso9660->joliet.size = archive_le32dec(p + DR_size_offset); + + return (48); +} + +static int +isSVD(struct iso9660 *iso9660, const unsigned char *h) +{ + const unsigned char *p; + ssize_t logical_block_size; + int32_t volume_block; + int32_t location; + int i; + + (void)iso9660; /* UNUSED */ + + /* Type 2 means it's a SVD. */ + if (h[SVD_type_offset] != 2) + return (0); + + /* Reserved field must be 0. */ + for (i = 0; i < SVD_reserved1_size; ++i) + if (h[SVD_reserved1_offset + i] != 0) + return (0); + for (i = 0; i < SVD_reserved2_size; ++i) + if (h[SVD_reserved2_offset + i] != 0) + return (0); + for (i = 0; i < SVD_reserved3_size; ++i) + if (h[SVD_reserved3_offset + i] != 0) + return (0); + + /* File structure version must be 1 for ISO9660/ECMA119. */ + if (h[SVD_file_structure_version_offset] != 1) + return (0); + + logical_block_size = + archive_le16dec(h + SVD_logical_block_size_offset); + if (logical_block_size <= 0) + return (0); + + volume_block = archive_le32dec(h + SVD_volume_space_size_offset); + if (volume_block <= SYSTEM_AREA_BLOCK+4) + return (0); + + /* Location of Occurrence of Type L Path Table must be + * available location, + * > SYSTEM_AREA_BLOCK(16) + 2 and < Volume Space Size. 
*/ + location = archive_le32dec(h+SVD_type_L_path_table_offset); + if (location <= SYSTEM_AREA_BLOCK+2 || location >= volume_block) + return (0); + + /* Location of Occurrence of Type M Path Table must be + * available location, + * > SYSTEM_AREA_BLOCK(16) + 2 and < Volume Space Size. */ + location = archive_be32dec(h+SVD_type_M_path_table_offset); + if (location <= SYSTEM_AREA_BLOCK+2 || location >= volume_block) + return (0); + + /* Read Root Directory Record in Volume Descriptor. */ + p = h + SVD_root_directory_record_offset; + if (p[DR_length_offset] != 34) + return (0); + + return (48); +} + +static int +isEVD(struct iso9660 *iso9660, const unsigned char *h) +{ + const unsigned char *p; + ssize_t logical_block_size; + int32_t volume_block; + int32_t location; + int i; + + (void)iso9660; /* UNUSED */ + + /* Type of the Enhanced Volume Descriptor must be 2. */ + if (h[PVD_type_offset] != 2) + return (0); + + /* EVD version must be 2. */ + if (h[PVD_version_offset] != 2) + return (0); + + /* Reserved field must be 0. */ + if (h[PVD_reserved1_offset] != 0) + return (0); + + /* Reserved field must be 0. */ + for (i = 0; i < PVD_reserved2_size; ++i) + if (h[PVD_reserved2_offset + i] != 0) + return (0); + + /* Reserved field must be 0. */ + for (i = 0; i < PVD_reserved3_size; ++i) + if (h[PVD_reserved3_offset + i] != 0) + return (0); + + /* Logical block size must be > 0. */ + /* I've looked at Ecma 119 and can't find any stronger + * restriction on this field. */ + logical_block_size = + archive_le16dec(h + PVD_logical_block_size_offset); + if (logical_block_size <= 0) + return (0); + + volume_block = + archive_le32dec(h + PVD_volume_space_size_offset); + if (volume_block <= SYSTEM_AREA_BLOCK+4) + return (0); + + /* File structure version must be 2 for ISO9660:1999. 
*/ + if (h[PVD_file_structure_version_offset] != 2) + return (0); + + /* Location of Occurrence of Type L Path Table must be + * available location, + * > SYSTEM_AREA_BLOCK(16) + 2 and < Volume Space Size. */ + location = archive_le32dec(h+PVD_type_1_path_table_offset); + if (location <= SYSTEM_AREA_BLOCK+2 || location >= volume_block) + return (0); + + /* Location of Occurrence of Type M Path Table must be + * available location, + * > SYSTEM_AREA_BLOCK(16) + 2 and < Volume Space Size. */ + location = archive_be32dec(h+PVD_type_m_path_table_offset); + if (location <= SYSTEM_AREA_BLOCK+2 || location >= volume_block) + return (0); + + /* Reserved field must be 0. */ + for (i = 0; i < PVD_reserved4_size; ++i) + if (h[PVD_reserved4_offset + i] != 0) + return (0); + + /* Reserved field must be 0. */ + for (i = 0; i < PVD_reserved5_size; ++i) + if (h[PVD_reserved5_offset + i] != 0) + return (0); + + /* Read Root Directory Record in Volume Descriptor. */ + p = h + PVD_root_directory_record_offset; + if (p[DR_length_offset] != 34) + return (0); + + return (48); +} + +static int +isPVD(struct iso9660 *iso9660, const unsigned char *h) +{ + const unsigned char *p; + ssize_t logical_block_size; + int32_t volume_block; + int32_t location; + int i; + + /* Type of the Primary Volume Descriptor must be 1. */ + if (h[PVD_type_offset] != 1) + return (0); + + /* PVD version must be 1. */ + if (h[PVD_version_offset] != 1) + return (0); + + /* Reserved field must be 0. */ + if (h[PVD_reserved1_offset] != 0) + return (0); + + /* Reserved field must be 0. */ + for (i = 0; i < PVD_reserved2_size; ++i) + if (h[PVD_reserved2_offset + i] != 0) + return (0); + + /* Reserved field must be 0. */ + for (i = 0; i < PVD_reserved3_size; ++i) + if (h[PVD_reserved3_offset + i] != 0) + return (0); + + /* Logical block size must be > 0. */ + /* I've looked at Ecma 119 and can't find any stronger + * restriction on this field. 
*/ + logical_block_size = + archive_le16dec(h + PVD_logical_block_size_offset); + if (logical_block_size <= 0) + return (0); + + volume_block = archive_le32dec(h + PVD_volume_space_size_offset); + if (volume_block <= SYSTEM_AREA_BLOCK+4) + return (0); + + /* File structure version must be 1 for ISO9660/ECMA119. */ + if (h[PVD_file_structure_version_offset] != 1) + return (0); + + /* Location of Occurrence of Type L Path Table must be + * available location, + * > SYSTEM_AREA_BLOCK(16) + 2 and < Volume Space Size. */ + location = archive_le32dec(h+PVD_type_1_path_table_offset); + if (location <= SYSTEM_AREA_BLOCK+2 || location >= volume_block) + return (0); + + /* Location of Occurrence of Type M Path Table must be + * available location, + * > SYSTEM_AREA_BLOCK(16) + 2 and < Volume Space Size. */ + location = archive_be32dec(h+PVD_type_m_path_table_offset); + if (location <= SYSTEM_AREA_BLOCK+2 || location >= volume_block) + return (0); + + /* Reserved field must be 0. */ + for (i = 0; i < PVD_reserved4_size; ++i) + if (h[PVD_reserved4_offset + i] != 0) + return (0); + + /* Reserved field must be 0. */ + for (i = 0; i < PVD_reserved5_size; ++i) + if (h[PVD_reserved5_offset + i] != 0) + return (0); + + /* XXX TODO: Check other values for sanity; reject more + * malformed PVDs. XXX */ + + /* Read Root Directory Record in Volume Descriptor. 
*/ + p = h + PVD_root_directory_record_offset; + if (p[DR_length_offset] != 34) + return (0); + + iso9660->logical_block_size = logical_block_size; + iso9660->volume_block = volume_block; + iso9660->volume_size = logical_block_size * (uint64_t)volume_block; + iso9660->primary.location = archive_le32dec(p + DR_extent_offset); + iso9660->primary.size = archive_le32dec(p + DR_size_offset); + + return (48); +} + +static int +read_children(struct archive_read *a, struct file_info *parent) +{ + struct iso9660 *iso9660; + const unsigned char *b, *p; + struct file_info *multi; + size_t step; + + iso9660 = (struct iso9660 *)(a->format->data); + if (iso9660->current_position > parent->offset) { + archive_set_error(&a->archive, ARCHIVE_ERRNO_MISC, + "Ignoring out-of-order directory (%s) %jd > %jd", + parent->name.s, + iso9660->current_position, + parent->offset); + return (ARCHIVE_WARN); + } + if (parent->offset + parent->size > iso9660->volume_size) { + archive_set_error(&a->archive, ARCHIVE_ERRNO_MISC, + "Directory is beyond end-of-media: %s", + parent->name.s); + return (ARCHIVE_WARN); + } + if (iso9660->current_position < parent->offset) { + int64_t skipsize; + + skipsize = parent->offset - iso9660->current_position; + skipsize = __archive_read_skip(a, skipsize); + if (skipsize < 0) + return ((int)skipsize); + iso9660->current_position = parent->offset; + } + + step = ((parent->size + iso9660->logical_block_size -1) / + iso9660->logical_block_size) * iso9660->logical_block_size; + b = __archive_read_ahead(a, step, NULL); + if (b == NULL) { + archive_set_error(&a->archive, ARCHIVE_ERRNO_MISC, + "Failed to read full block when scanning " + "ISO9660 directory list"); + return (ARCHIVE_FATAL); + } + __archive_read_consume(a, step); + iso9660->current_position += step; + multi = NULL; + while (step) { + p = b; + b += iso9660->logical_block_size; + step -= iso9660->logical_block_size; + for (; *p != 0 && p < b && p + *p <= b; p += *p) { + struct file_info *child; + + /* N.B.: 
these special directory identifiers + * are 8 bit "values" even on a + * Joliet CD with UCS-2 (16bit) encoding. + */ + + /* Skip '.' entry. */ + if (*(p + DR_name_len_offset) == 1 + && *(p + DR_name_offset) == '\0') + continue; + /* Skip '..' entry. */ + if (*(p + DR_name_len_offset) == 1 + && *(p + DR_name_offset) == '\001') + continue; + child = parse_file_info(a, parent, p); + if (child == NULL) + return (ARCHIVE_FATAL); + if (child->cl_offset) + heap_add_entry(&(iso9660->cl_files), + child, child->cl_offset); + else { + if (child->multi_extent || multi != NULL) { + struct content *con; + + if (multi == NULL) { + multi = child; + multi->contents.first = NULL; + multi->contents.last = + &(multi->contents.first); + } + con = malloc(sizeof(struct content)); + if (con == NULL) { + archive_set_error( + &a->archive, ENOMEM, + "No memory for " + "multi extent"); + return (ARCHIVE_FATAL); + } + con->offset = child->offset; + con->size = child->size; + con->next = NULL; + *multi->contents.last = con; + multi->contents.last = &(con->next); + if (multi == child) + add_entry(iso9660, child); + else { + multi->size += child->size; + if (!child->multi_extent) + multi = NULL; + } + } else + add_entry(iso9660, child); + } + } + } + + /* Read data which is recorded by RRIP "CE" extension. */ + if (read_CE(a, iso9660) != ARCHIVE_OK) + return (ARCHIVE_FATAL); + + return (ARCHIVE_OK); +} + +static int +relocate_dir(struct iso9660 *iso9660, struct file_info *file) +{ + struct file_info *re; + + re = heap_get_entry(&(iso9660->re_dirs)); + while (re != NULL && re->offset < file->cl_offset) { + /* This case is wrong pattern. + * But don't reject this directory entry, to be robust. */ + cache_add_entry(iso9660, re); + re = heap_get_entry(&(iso9660->re_dirs)); + } + if (re == NULL) + /* This case is wrong pattern. 
*/ + return (0); + if (re->offset == file->cl_offset) { + re->parent->subdirs--; + re->parent = file->parent; + re->parent->subdirs++; + cache_add_to_next_of_parent(iso9660, re); + return (1); + } else + /* This case is wrong pattern. */ + heap_add_entry(&(iso9660->re_dirs), re, re->offset); + return (0); +} + +static int +read_entries(struct archive_read *a) +{ + struct iso9660 *iso9660; + struct file_info *file; + int r; + + iso9660 = (struct iso9660 *)(a->format->data); + + while ((file = next_entry(iso9660)) != NULL && + (file->mode & AE_IFMT) == AE_IFDIR) { + r = read_children(a, file); + if (r != ARCHIVE_OK) + return (r); + + if (iso9660->seenRockridge && + file->parent != NULL && + file->parent->parent == NULL && + iso9660->rr_moved == NULL && + (strcmp(file->name.s, "rr_moved") == 0 || + strcmp(file->name.s, ".rr_moved") == 0)) { + iso9660->rr_moved = file; + } else if (file->re) + heap_add_entry(&(iso9660->re_dirs), file, + file->offset); + else + cache_add_entry(iso9660, file); + } + if (file != NULL) + add_entry(iso9660, file); + + if (iso9660->rr_moved != NULL) { + /* + * Relocate directory which rr_moved has. + */ + while ((file = heap_get_entry(&(iso9660->cl_files))) != NULL) + relocate_dir(iso9660, file); + + /* If rr_moved directory still has children, + * Add rr_moved into pending_files to show + */ + if (iso9660->rr_moved->subdirs) { + cache_add_entry(iso9660, iso9660->rr_moved); + /* If entries which have "RE" extension are still + * remaining(this case is unlikely except ISO image + * is broken), the entries won't be exposed. */ + while ((file = heap_get_entry(&(iso9660->re_dirs))) != NULL) + cache_add_entry(iso9660, file); + } else + iso9660->rr_moved->parent->subdirs--; + } else { + /* + * In case ISO image is broken. If the name of rr_moved + * directory has been changed by damage, subdirectories + * of rr_moved entry won't be exposed. 
+ */ + while ((file = heap_get_entry(&(iso9660->re_dirs))) != NULL) + cache_add_entry(iso9660, file); + } + + return (ARCHIVE_OK); +} + +static int +archive_read_format_iso9660_read_header(struct archive_read *a, + struct archive_entry *entry) +{ + struct iso9660 *iso9660; + struct file_info *file; + int r, rd_r; + + iso9660 = (struct iso9660 *)(a->format->data); + + if (!a->archive.archive_format) { + a->archive.archive_format = ARCHIVE_FORMAT_ISO9660; + a->archive.archive_format_name = "ISO9660"; + } + + if (iso9660->current_position == 0) { + int64_t skipsize; + struct vd *vd; + const void *block; + char seenJoliet; + + vd = &(iso9660->primary); + if (!iso9660->opt_support_joliet) + iso9660->seenJoliet = 0; + if (iso9660->seenJoliet && + vd->location > iso9660->joliet.location) + /* This condition is unlikely; by way of caution. */ + vd = &(iso9660->joliet); + + skipsize = LOGICAL_BLOCK_SIZE * vd->location; + skipsize = __archive_read_skip(a, skipsize); + if (skipsize < 0) + return ((int)skipsize); + iso9660->current_position = skipsize; + + block = __archive_read_ahead(a, vd->size, NULL); + if (block == NULL) { + archive_set_error(&a->archive, + ARCHIVE_ERRNO_MISC, + "Failed to read full block when scanning " + "ISO9660 directory list"); + return (ARCHIVE_FATAL); + } + + /* + * While reading Root Directory, flag seenJoliet + * must be zero to avoid converting special name + * 0x00(Current Directory) and next byte to UCS2. + */ + seenJoliet = iso9660->seenJoliet;/* Save flag. */ + iso9660->seenJoliet = 0; + file = parse_file_info(a, NULL, block); + if (file == NULL) + return (ARCHIVE_FATAL); + iso9660->seenJoliet = seenJoliet; + if (vd == &(iso9660->primary) && iso9660->seenRockridge + && iso9660->seenJoliet) + /* + * If iso image has RockRidge and Joliet, + * we use RockRidge Extensions. + */ + iso9660->seenJoliet = 0; + if (vd == &(iso9660->primary) && !iso9660->seenRockridge + && iso9660->seenJoliet) { + /* Switch reading data from primary to joliet. 
*/ + vd = &(iso9660->joliet); + skipsize = LOGICAL_BLOCK_SIZE * vd->location; + skipsize -= iso9660->current_position; + skipsize = __archive_read_skip(a, skipsize); + if (skipsize < 0) + return ((int)skipsize); + iso9660->current_position += skipsize; + + block = __archive_read_ahead(a, vd->size, NULL); + if (block == NULL) { + archive_set_error(&a->archive, + ARCHIVE_ERRNO_MISC, + "Failed to read full block when scanning " + "ISO9660 directory list"); + return (ARCHIVE_FATAL); + } + seenJoliet = iso9660->seenJoliet;/* Save flag. */ + iso9660->seenJoliet = 0; + file = parse_file_info(a, NULL, block); + if (file == NULL) + return (ARCHIVE_FATAL); + iso9660->seenJoliet = seenJoliet; + } + /* Store the root directory in the pending list. */ + add_entry(iso9660, file); + if (iso9660->seenRockridge) { + a->archive.archive_format = + ARCHIVE_FORMAT_ISO9660_ROCKRIDGE; + a->archive.archive_format_name = + "ISO9660 with Rockridge extensions"; + } + rd_r = read_entries(a); + if (rd_r == ARCHIVE_FATAL) + return (ARCHIVE_FATAL); + } else + rd_r = ARCHIVE_OK; + + /* Get the next entry that appears after the current offset. */ + r = next_entry_seek(a, iso9660, &file); + if (r != ARCHIVE_OK) + return (r); + + iso9660->entry_bytes_remaining = file->size; + iso9660->entry_sparse_offset = 0; /* Offset for sparse-file-aware clients. */ + + if (file->offset + file->size > iso9660->volume_size) { + archive_set_error(&a->archive, ARCHIVE_ERRNO_MISC, + "File is beyond end-of-media: %s", file->name.s); + iso9660->entry_bytes_remaining = 0; + iso9660->entry_sparse_offset = 0; + return (ARCHIVE_WARN); + } + + /* Set up the entry structure with information about this entry. 
*/ + archive_entry_set_mode(entry, file->mode); + archive_entry_set_uid(entry, file->uid); + archive_entry_set_gid(entry, file->gid); + archive_entry_set_nlink(entry, file->nlinks); + if (file->birthtime_is_set) + archive_entry_set_birthtime(entry, file->birthtime, 0); + else + archive_entry_unset_birthtime(entry); + archive_entry_set_mtime(entry, file->mtime, 0); + archive_entry_set_ctime(entry, file->ctime, 0); + archive_entry_set_atime(entry, file->atime, 0); + /* N.B.: Rock Ridge supports 64-bit device numbers. */ + archive_entry_set_rdev(entry, (dev_t)file->rdev); + archive_entry_set_size(entry, iso9660->entry_bytes_remaining); + archive_string_empty(&iso9660->pathname); + archive_entry_set_pathname(entry, + build_pathname(&iso9660->pathname, file)); + if (file->symlink.s != NULL) + archive_entry_copy_symlink(entry, file->symlink.s); + + /* Note: If the input isn't seekable, we can't rewind to + * return the same body again, so if the next entry refers to + * the same data, we have to return it as a hardlink to the + * original entry. */ + if (file->number != -1 && + file->number == iso9660->previous_number) { + archive_entry_set_hardlink(entry, + iso9660->previous_pathname.s); + archive_entry_unset_size(entry); + iso9660->entry_bytes_remaining = 0; + iso9660->entry_sparse_offset = 0; + return (ARCHIVE_OK); + } + + /* Except for the hardlink case above, if the offset of the + * next entry is before our current position, we can't seek + * backwards to extract it, so issue a warning. Note that + * this can only happen if this entry was added to the heap + * after we passed this offset, that is, only if the directory + * mentioning this entry is later than the body of the entry. + * Such layouts are very unusual; most ISO9660 writers lay out + * and record all directory information first, then store + * all file bodies. */ + /* TODO: Someday, libarchive's I/O core will support optional + * seeking. 
When that day comes, this code should attempt to + * seek and only return the error if the seek fails. That + * will give us support for whacky ISO images that require + * seeking while retaining the ability to read almost all ISO + * images in a streaming fashion. */ + if ((file->mode & AE_IFMT) != AE_IFDIR && + file->offset < iso9660->current_position) { + archive_set_error(&a->archive, ARCHIVE_ERRNO_MISC, + "Ignoring out-of-order file @%x (%s) %jd < %jd", + file, + iso9660->pathname.s, + file->offset, iso9660->current_position); + iso9660->entry_bytes_remaining = 0; + iso9660->entry_sparse_offset = 0; + return (ARCHIVE_WARN); + } + + /* Initialize zisofs variables. */ + iso9660->entry_zisofs.pz = file->pz; + if (file->pz) { +#ifdef HAVE_ZLIB_H + struct zisofs *zisofs; + + zisofs = &iso9660->entry_zisofs; + zisofs->initialized = 0; + zisofs->pz_log2_bs = file->pz_log2_bs; + zisofs->pz_uncompressed_size = file->pz_uncompressed_size; + zisofs->pz_offset = 0; + zisofs->header_avail = 0; + zisofs->header_passed = 0; + zisofs->block_pointers_avail = 0; +#endif + archive_entry_set_size(entry, file->pz_uncompressed_size); + } + + iso9660->previous_number = file->number; + archive_strcpy(&iso9660->previous_pathname, iso9660->pathname.s); + + /* Reset entry_bytes_remaining if the file is multi extent. */ + iso9660->entry_content = file->contents.first; + if (iso9660->entry_content != NULL) + iso9660->entry_bytes_remaining = iso9660->entry_content->size; + + if (archive_entry_filetype(entry) == AE_IFDIR) { + /* Overwrite nlinks by proper link number which is + * calculated from number of sub directories. */ + archive_entry_set_nlink(entry, 2 + file->subdirs); + /* Directory data has been read completely. 
*/ + iso9660->entry_bytes_remaining = 0; + iso9660->entry_sparse_offset = 0; + file->exposed = 1; + } + + if (rd_r != ARCHIVE_OK) + return (rd_r); + return (ARCHIVE_OK); +} + +static int +archive_read_format_iso9660_read_data_skip(struct archive_read *a) +{ + /* Because read_next_header always does an explicit skip + * to the next entry, we don't need to do anything here. */ + (void)a; /* UNUSED */ + return (ARCHIVE_OK); +} + +#ifdef HAVE_ZLIB_H + +static int +zisofs_read_data(struct archive_read *a, + const void **buff, size_t *size, off_t *offset) +{ + struct iso9660 *iso9660; + struct zisofs *zisofs; + const unsigned char *p; + size_t avail; + ssize_t bytes_read; + size_t uncompressed_size; + int r; + + iso9660 = (struct iso9660 *)(a->format->data); + zisofs = &iso9660->entry_zisofs; + + p = __archive_read_ahead(a, 1, &bytes_read); + if (bytes_read <= 0) { + archive_set_error(&a->archive, ARCHIVE_ERRNO_FILE_FORMAT, + "Truncated zisofs file body"); + return (ARCHIVE_FATAL); + } + if (bytes_read > iso9660->entry_bytes_remaining) + bytes_read = iso9660->entry_bytes_remaining; + avail = bytes_read; + uncompressed_size = 0; + + if (!zisofs->initialized) { + size_t ceil, xsize; + + /* Allocate block pointers buffer. */ + ceil = (zisofs->pz_uncompressed_size + + (1LL << zisofs->pz_log2_bs) - 1) + >> zisofs->pz_log2_bs; + xsize = (ceil + 1) * 4; + if (zisofs->block_pointers_alloc < xsize) { + size_t alloc; + + if (zisofs->block_pointers != NULL) + free(zisofs->block_pointers); + alloc = ((xsize >> 10) + 1) << 10; + zisofs->block_pointers = malloc(alloc); + if (zisofs->block_pointers == NULL) { + archive_set_error(&a->archive, ENOMEM, + "No memory for zisofs decompression"); + return (ARCHIVE_FATAL); + } + zisofs->block_pointers_alloc = alloc; + } + zisofs->block_pointers_size = xsize; + + /* Allocate uncompressed data buffer. 
*/ + xsize = 1UL << zisofs->pz_log2_bs; + if (zisofs->uncompressed_buffer_size < xsize) { + if (zisofs->uncompressed_buffer != NULL) + free(zisofs->uncompressed_buffer); + zisofs->uncompressed_buffer = malloc(xsize); + if (zisofs->uncompressed_buffer == NULL) { + archive_set_error(&a->archive, ENOMEM, + "No memory for zisofs decompression"); + return (ARCHIVE_FATAL); + } + } + zisofs->uncompressed_buffer_size = xsize; + + /* + * Read the file header, and check the magic code of zisofs. + */ + if (zisofs->header_avail < sizeof(zisofs->header)) { + xsize = sizeof(zisofs->header) - zisofs->header_avail; + if (avail < xsize) + xsize = avail; + memcpy(zisofs->header + zisofs->header_avail, p, xsize); + zisofs->header_avail += xsize; + avail -= xsize; + p += xsize; + } + if (!zisofs->header_passed && + zisofs->header_avail == sizeof(zisofs->header)) { + int err = 0; + + if (memcmp(zisofs->header, zisofs_magic, + sizeof(zisofs_magic)) != 0) + err = 1; + if (archive_le32dec(zisofs->header + 8) + != zisofs->pz_uncompressed_size) + err = 1; + if (zisofs->header[12] != 4) + err = 1; + if (zisofs->header[13] != zisofs->pz_log2_bs) + err = 1; + if (err) { + archive_set_error(&a->archive, + ARCHIVE_ERRNO_FILE_FORMAT, + "Illegal zisofs file body"); + return (ARCHIVE_FATAL); + } + zisofs->header_passed = 1; + } + /* + * Read block pointers. + */ + if (zisofs->header_passed && + zisofs->block_pointers_avail < zisofs->block_pointers_size) { + xsize = zisofs->block_pointers_size + - zisofs->block_pointers_avail; + if (avail < xsize) + xsize = avail; + memcpy(zisofs->block_pointers + + zisofs->block_pointers_avail, p, xsize); + zisofs->block_pointers_avail += xsize; + avail -= xsize; + p += xsize; + if (zisofs->block_pointers_avail + == zisofs->block_pointers_size) { + /* We've got all block pointers and initialize + * related variables. 
*/ + zisofs->block_off = 0; + zisofs->block_avail = 0; + /* Complete the initialization */ + zisofs->initialized = 1; + } + } + + if (!zisofs->initialized) + goto next_data; /* We need more data. */ + } + + /* + * Get block offsets from block pointers. + */ + if (zisofs->block_avail == 0) { + uint32_t bst, bed; + + if (zisofs->block_off + 4 >= zisofs->block_pointers_size) { + /* There isn't a pair of offsets. */ + archive_set_error(&a->archive, ARCHIVE_ERRNO_FILE_FORMAT, + "Illegal zisofs block pointers"); + return (ARCHIVE_FATAL); + } + bst = archive_le32dec(zisofs->block_pointers + zisofs->block_off); + if (bst != zisofs->pz_offset + (bytes_read - avail)) { + /* TODO: Should we seek offset of current file by bst ? */ + archive_set_error(&a->archive, ARCHIVE_ERRNO_FILE_FORMAT, + "Illegal zisofs block pointers(cannot seek)"); + return (ARCHIVE_FATAL); + } + bed = archive_le32dec( + zisofs->block_pointers + zisofs->block_off + 4); + if (bed < bst) { + archive_set_error(&a->archive, ARCHIVE_ERRNO_FILE_FORMAT, + "Illegal zisofs block pointers"); + return (ARCHIVE_FATAL); + } + zisofs->block_avail = bed - bst; + zisofs->block_off += 4; + + /* Initialize compression library for new block. */ + if (zisofs->stream_valid) + r = inflateReset(&zisofs->stream); + else + r = inflateInit(&zisofs->stream); + if (r != Z_OK) { + archive_set_error(&a->archive, ARCHIVE_ERRNO_MISC, + "Can't initialize zisofs decompression."); + return (ARCHIVE_FATAL); + } + zisofs->stream_valid = 1; + zisofs->stream.total_in = 0; + zisofs->stream.total_out = 0; + } + + /* + * Make uncompressed data. 
+ */ + if (zisofs->block_avail == 0) { + memset(zisofs->uncompressed_buffer, 0, + zisofs->uncompressed_buffer_size); + uncompressed_size = zisofs->uncompressed_buffer_size; + } else { + zisofs->stream.next_in = (Bytef *)(uintptr_t)(const void *)p; + if (avail > zisofs->block_avail) + zisofs->stream.avail_in = zisofs->block_avail; + else + zisofs->stream.avail_in = avail; + zisofs->stream.next_out = zisofs->uncompressed_buffer; + zisofs->stream.avail_out = zisofs->uncompressed_buffer_size; + + r = inflate(&zisofs->stream, 0); + switch (r) { + case Z_OK: /* Decompressor made some progress.*/ + case Z_STREAM_END: /* Found end of stream. */ + break; + default: + archive_set_error(&a->archive, ARCHIVE_ERRNO_MISC, + "zisofs decompression failed (%d)", r); + return (ARCHIVE_FATAL); + } + uncompressed_size = + zisofs->uncompressed_buffer_size - zisofs->stream.avail_out; + avail -= zisofs->stream.next_in - p; + zisofs->block_avail -= zisofs->stream.next_in - p; + } +next_data: + bytes_read -= avail; + *buff = zisofs->uncompressed_buffer; + *size = uncompressed_size; + *offset = iso9660->entry_sparse_offset; + iso9660->entry_sparse_offset += uncompressed_size; + iso9660->entry_bytes_remaining -= bytes_read; + iso9660->current_position += bytes_read; + zisofs->pz_offset += bytes_read; + __archive_read_consume(a, bytes_read); + + return (ARCHIVE_OK); +} + +#else /* HAVE_ZLIB_H */ + +static int +zisofs_read_data(struct archive_read *a, + const void **buff, size_t *size, off_t *offset) +{ + + (void)buff;/* UNUSED */ + (void)size;/* UNUSED */ + (void)offset;/* UNUSED */ + archive_set_error(&a->archive, ARCHIVE_ERRNO_FILE_FORMAT, + "zisofs is not supported on this platform."); + return (ARCHIVE_FAILED); +} + +#endif /* HAVE_ZLIB_H */ + +static int +archive_read_format_iso9660_read_data(struct archive_read *a, + const void **buff, size_t *size, off_t *offset) +{ + ssize_t bytes_read; + struct iso9660 *iso9660; + + iso9660 = (struct iso9660 *)(a->format->data); + if 
(iso9660->entry_bytes_remaining <= 0) { + if (iso9660->entry_content != NULL) + iso9660->entry_content = iso9660->entry_content->next; + if (iso9660->entry_content == NULL) { + *buff = NULL; + *size = 0; + *offset = iso9660->entry_sparse_offset; + return (ARCHIVE_EOF); + } + /* Seek forward to the start of the entry. */ + if (iso9660->current_position < iso9660->entry_content->offset) { + int64_t step; + + step = iso9660->entry_content->offset - + iso9660->current_position; + step = __archive_read_skip(a, step); + if (step < 0) + return ((int)step); + iso9660->current_position = + iso9660->entry_content->offset; + } + if (iso9660->entry_content->offset < iso9660->current_position) { + archive_set_error(&a->archive, ARCHIVE_ERRNO_MISC, + "Ignoring out-of-order file (%s) %jd < %jd", + iso9660->pathname.s, + iso9660->entry_content->offset, + iso9660->current_position); + *buff = NULL; + *size = 0; + *offset = iso9660->entry_sparse_offset; + return (ARCHIVE_WARN); + } + iso9660->entry_bytes_remaining = iso9660->entry_content->size; + } + if (iso9660->entry_zisofs.pz) + return (zisofs_read_data(a, buff, size, offset)); + + *buff = __archive_read_ahead(a, 1, &bytes_read); + if (bytes_read == 0) + archive_set_error(&a->archive, ARCHIVE_ERRNO_MISC, + "Truncated input file"); + if (*buff == NULL) + return (ARCHIVE_FATAL); + if (bytes_read > iso9660->entry_bytes_remaining) + bytes_read = iso9660->entry_bytes_remaining; + *size = bytes_read; + *offset = iso9660->entry_sparse_offset; + iso9660->entry_sparse_offset += bytes_read; + iso9660->entry_bytes_remaining -= bytes_read; + iso9660->current_position += bytes_read; + __archive_read_consume(a, bytes_read); + return (ARCHIVE_OK); +} + +static int +archive_read_format_iso9660_cleanup(struct archive_read *a) +{ + struct iso9660 *iso9660; + int r = ARCHIVE_OK; + + iso9660 = (struct iso9660 *)(a->format->data); + release_files(iso9660); + free(iso9660->read_ce_req.reqs); + archive_string_free(&iso9660->pathname); + 
archive_string_free(&iso9660->previous_pathname); + if (iso9660->pending_files.files) + free(iso9660->pending_files.files); + if (iso9660->re_dirs.files) + free(iso9660->re_dirs.files); + if (iso9660->cl_files.files) + free(iso9660->cl_files.files); +#ifdef HAVE_ZLIB_H + free(iso9660->entry_zisofs.uncompressed_buffer); + free(iso9660->entry_zisofs.block_pointers); + if (iso9660->entry_zisofs.stream_valid) { + if (inflateEnd(&iso9660->entry_zisofs.stream) != Z_OK) { + archive_set_error(&a->archive, ARCHIVE_ERRNO_MISC, + "Failed to clean up zlib decompressor"); + r = ARCHIVE_FATAL; + } + } +#endif + free(iso9660); + (a->format->data) = NULL; + return (r); +} + +/* + * This routine parses a single ISO directory record, makes sense + * of any extensions, and stores the result in memory. + */ +static struct file_info * +parse_file_info(struct archive_read *a, struct file_info *parent, + const unsigned char *isodirrec) +{ + struct iso9660 *iso9660; + struct file_info *file; + size_t name_len; + const unsigned char *rr_start, *rr_end; + const unsigned char *p; + size_t dr_len; + int32_t location; + int flags; + + iso9660 = (struct iso9660 *)(a->format->data); + + dr_len = (size_t)isodirrec[DR_length_offset]; + name_len = (size_t)isodirrec[DR_name_len_offset]; + location = archive_le32dec(isodirrec + DR_extent_offset); + /* Sanity check that dr_len needs at least 34. */ + if (dr_len < 34) { + archive_set_error(&a->archive, ARCHIVE_ERRNO_MISC, + "Invalid length of directory record"); + return (NULL); + } + /* Sanity check that name_len doesn't exceed dr_len. */ + if (dr_len - 33 < name_len || name_len == 0) { + archive_set_error(&a->archive, ARCHIVE_ERRNO_MISC, + "Invalid length of file identifier"); + return (NULL); + } + /* Sanity check that location doesn't exceed volume block. + * Don't check lower limit of location; it's possibility + * the location has negative value when file type is symbolic + * link or file size is zero. As far as I know latest mkisofs + * do that. 
+ */ + if (location >= iso9660->volume_block) { + archive_set_error(&a->archive, ARCHIVE_ERRNO_MISC, + "Invalid location of extent of file"); + return (NULL); + } + + /* Create a new file entry and copy data from the ISO dir record. */ + file = (struct file_info *)malloc(sizeof(*file)); + if (file == NULL) { + archive_set_error(&a->archive, ENOMEM, + "No memory for file entry"); + return (NULL); + } + memset(file, 0, sizeof(*file)); + file->parent = parent; + file->offset = iso9660->logical_block_size * (uint64_t)location; + file->size = toi(isodirrec + DR_size_offset, DR_size_size); + file->mtime = isodate7(isodirrec + DR_date_offset); + file->ctime = file->atime = file->mtime; + + p = isodirrec + DR_name_offset; + /* Rockridge extensions (if any) follow name. Compute this + * before fidgeting the name_len below. */ + rr_start = p + name_len + (name_len & 1 ? 0 : 1); + rr_end = isodirrec + dr_len; + + if (iso9660->seenJoliet) { + /* Joliet names are max 64 chars (128 bytes) according to spec, + * but genisoimage/mkisofs allows recording longer Joliet + * names which are 103 UCS2 characters(206 bytes) by their + * option '-joliet-long'. + */ + wchar_t wbuff[103+1], *wp; + const unsigned char *c; + + if (name_len > 206) + name_len = 206; + /* convert BE UTF-16 to wchar_t */ + for (c = p, wp = wbuff; + c < (p + name_len) && + wp < (wbuff + sizeof(wbuff)/sizeof(*wbuff) - 1); + c += 2) { + *wp++ = (((255 & (int)c[0]) << 8) | (255 & (int)c[1])); + } + *wp = L'\0'; + +#if 0 /* untested code, is it at all useful on Joliet? */ + /* trim trailing first version and dot from filename. + * + * Remember we were in UTF-16BE land! + * SEPARATOR 1 (.) and SEPARATOR 2 (;) are both + * 16 bits big endian characters on Joliet. + * + * TODO: sanitize filename? + * Joliet allows any UCS-2 char except: + * *, /, :, ;, ? and \. + */ + /* Chop off trailing ';1' from files. */ + if (*(wp-2) == ';' && *(wp-1) == '1') { + wp-=2; + *wp = L'\0'; + } + + /* Chop off trailing '.' 
from filenames. */ + if (*(wp-1) == '.') + *(--wp) = L'\0'; +#endif + + /* store the result in the file name field. */ + archive_strappend_w_utf8(&file->name, wbuff); + } else { + /* Chop off trailing ';1' from files. */ + if (name_len > 2 && p[name_len - 2] == ';' && + p[name_len - 1] == '1') + name_len -= 2; + /* Chop off trailing '.' from filenames. */ + if (name_len > 1 && p[name_len - 1] == '.') + --name_len; + + archive_strncpy(&file->name, (const char *)p, name_len); + } + + flags = isodirrec[DR_flags_offset]; + if (flags & 0x02) + file->mode = AE_IFDIR | 0700; + else + file->mode = AE_IFREG | 0400; + if (flags & 0x80) + file->multi_extent = 1; + else + file->multi_extent = 0; + /* + * Use location for file number. + * File number is treated as inode number to find out hardlink + * target. If Rockridge extensions are being used, file number + * will be overwritten by FILE SERIAL NUMBER of RRIP "PX" + * extension. + * NOTE: Old mkisofs did not record that FILE SERIAL NUMBER + * in ISO images. + */ + if (file->size == 0 && location >= 0) + /* If file->size is zero, its location points wrong place. + * Do not use it for file number. + * When location has negative value, it can be used + * for file number. + */ + file->number = -1; + else + file->number = (int64_t)(uint32_t)location; + + /* Rockridge extensions overwrite information from above. */ + if (iso9660->opt_support_rockridge) { + if (parent == NULL && rr_end - rr_start >= 7) { + p = rr_start; + if (p[0] == 'S' && p[1] == 'P' + && p[2] == 7 && p[3] == 1 + && p[4] == 0xBE && p[5] == 0xEF) { + /* + * SP extension stores the suspOffset + * (Number of bytes to skip between + * filename and SUSP records.) + * It is mandatory by the SUSP standard + * (IEEE 1281). + * + * It allows SUSP to coexist with + * non-SUSP uses of the System + * Use Area by placing non-SUSP data + * before SUSP data. + * + * SP extension must be in the root + * directory entry, disable all SUSP + * processing if not found. 
+ */ + iso9660->suspOffset = p[6]; + iso9660->seenSUSP = 1; + rr_start += 7; + } + } + if (iso9660->seenSUSP) { + int r; + + file->name_continues = 0; + file->symlink_continues = 0; + rr_start += iso9660->suspOffset; + r = parse_rockridge(a, file, rr_start, rr_end); + if (r != ARCHIVE_OK) { + free(file); + return (NULL); + } + } else + /* If there isn't SUSP, disable parsing + * rock ridge extensions. */ + iso9660->opt_support_rockridge = 0; + } + + file->nlinks = 1;/* Reset nlink. we'll calculate it later. */ + /* Tell file's parent how many children that parent has. */ + if (parent != NULL && (flags & 0x02) && file->cl_offset == 0) + parent->subdirs++; + +#if DEBUG + /* DEBUGGING: Warn about attributes I don't yet fully support. */ + if ((flags & ~0x02) != 0) { + fprintf(stderr, "\n ** Unrecognized flag: "); + dump_isodirrec(stderr, isodirrec); + fprintf(stderr, "\n"); + } else if (toi(isodirrec + DR_volume_sequence_number_offset, 2) != 1) { + fprintf(stderr, "\n ** Unrecognized sequence number: "); + dump_isodirrec(stderr, isodirrec); + fprintf(stderr, "\n"); + } else if (*(isodirrec + DR_file_unit_size_offset) != 0) { + fprintf(stderr, "\n ** Unexpected file unit size: "); + dump_isodirrec(stderr, isodirrec); + fprintf(stderr, "\n"); + } else if (*(isodirrec + DR_interleave_offset) != 0) { + fprintf(stderr, "\n ** Unexpected interleave: "); + dump_isodirrec(stderr, isodirrec); + fprintf(stderr, "\n"); + } else if (*(isodirrec + DR_ext_attr_length_offset) != 0) { + fprintf(stderr, "\n ** Unexpected extended attribute length: "); + dump_isodirrec(stderr, isodirrec); + fprintf(stderr, "\n"); + } +#endif + register_file(iso9660, file); + return (file); +} + +static int +parse_rockridge(struct archive_read *a, struct file_info *file, + const unsigned char *p, const unsigned char *end) +{ + struct iso9660 *iso9660; + + iso9660 = (struct iso9660 *)(a->format->data); + + while (p + 4 <= end /* Enough space for another entry. 
*/ + && p[0] >= 'A' && p[0] <= 'Z' /* Sanity-check 1st char of name. */ + && p[1] >= 'A' && p[1] <= 'Z' /* Sanity-check 2nd char of name. */ + && p[2] >= 4 /* Sanity-check length. */ + && p + p[2] <= end) { /* Sanity-check length. */ + const unsigned char *data = p + 4; + int data_length = p[2] - 4; + int version = p[3]; + + /* + * Yes, each 'if' here does test p[0] again. + * Otherwise, the fall-through handling to catch + * unsupported extensions doesn't work. + */ + switch(p[0]) { + case 'C': + if (p[0] == 'C' && p[1] == 'E') { + if (version == 1 && data_length == 24) { + /* + * CE extension comprises: + * 8 byte sector containing extension + * 8 byte offset w/in above sector + * 8 byte length of continuation + */ + int32_t location = + archive_le32dec(data); + file->ce_offset = + archive_le32dec(data+8); + file->ce_size = + archive_le32dec(data+16); + if (register_CE(a, location, file) + != ARCHIVE_OK) + return (ARCHIVE_FATAL); + } + break; + } + if (p[0] == 'C' && p[1] == 'L') { + if (version == 1 && data_length == 8) { + file->cl_offset = (uint64_t) + iso9660->logical_block_size * + (uint64_t)archive_le32dec(data); + iso9660->seenRockridge = 1; + } + break; + } + /* FALLTHROUGH */ + case 'N': + if (p[0] == 'N' && p[1] == 'M') { + if (version == 1) { + parse_rockridge_NM1(file, + data, data_length); + iso9660->seenRockridge = 1; + } + break; + } + /* FALLTHROUGH */ + case 'P': + if (p[0] == 'P' && p[1] == 'D') { + /* + * PD extension is padding; + * contents are always ignored. + */ + break; + } + if (p[0] == 'P' && p[1] == 'N') { + if (version == 1 && data_length == 16) { + file->rdev = toi(data,4); + file->rdev <<= 32; + file->rdev |= toi(data + 8, 4); + iso9660->seenRockridge = 1; + } + break; + } + if (p[0] == 'P' && p[1] == 'X') { + /* + * PX extension comprises: + * 8 bytes for mode, + * 8 bytes for nlinks, + * 8 bytes for uid, + * 8 bytes for gid, + * 8 bytes for inode. 
+ */ + if (version == 1) { + if (data_length >= 8) + file->mode + = toi(data, 4); + if (data_length >= 16) + file->nlinks + = toi(data + 8, 4); + if (data_length >= 24) + file->uid + = toi(data + 16, 4); + if (data_length >= 32) + file->gid + = toi(data + 24, 4); + if (data_length >= 40) + file->number + = toi(data + 32, 4); + iso9660->seenRockridge = 1; + } + break; + } + /* FALLTHROUGH */ + case 'R': + if (p[0] == 'R' && p[1] == 'E' && version == 1) { + file->re = 1; + iso9660->seenRockridge = 1; + break; + } + if (p[0] == 'R' && p[1] == 'R' && version == 1) { + /* + * RR extension comprises: + * one byte flag value + * This extension is obsolete, + * so contents are always ignored. + */ + break; + } + /* FALLTHROUGH */ + case 'S': + if (p[0] == 'S' && p[1] == 'L') { + if (version == 1) { + parse_rockridge_SL1(file, + data, data_length); + iso9660->seenRockridge = 1; + } + break; + } + if (p[0] == 'S' && p[1] == 'T' + && data_length == 0 && version == 1) { + /* + * ST extension marks end of this + * block of SUSP entries. + * + * It allows SUSP to coexist with + * non-SUSP uses of the System + * Use Area by placing non-SUSP data + * after SUSP data. + */ + iso9660->seenSUSP = 0; + iso9660->seenRockridge = 0; + return (ARCHIVE_OK); + } + case 'T': + if (p[0] == 'T' && p[1] == 'F') { + if (version == 1) { + parse_rockridge_TF1(file, + data, data_length); + iso9660->seenRockridge = 1; + } + break; + } + /* FALLTHROUGH */ + case 'Z': + if (p[0] == 'Z' && p[1] == 'F') { + if (version == 1) + parse_rockridge_ZF1(file, + data, data_length); + break; + } + /* FALLTHROUGH */ + default: + /* The FALLTHROUGHs above leave us here for + * any unsupported extension. 
*/ + break; + } + + + + p += p[2]; + } + return (ARCHIVE_OK); +} + +static int +register_CE(struct archive_read *a, int32_t location, + struct file_info *file) +{ + struct iso9660 *iso9660; + struct read_ce_queue *heap; + struct read_ce_req *p; + uint64_t offset, parent_offset; + int hole, parent; + + iso9660 = (struct iso9660 *)(a->format->data); + offset = ((uint64_t)location) * (uint64_t)iso9660->logical_block_size; + if (((file->mode & AE_IFMT) == AE_IFREG && + offset >= file->offset) || + offset < iso9660->current_position) { + archive_set_error(&a->archive, ARCHIVE_ERRNO_MISC, + "Invalid location in SUSP \"CE\" extension"); + return (ARCHIVE_FATAL); + } + + /* Expand our CE list as necessary. */ + heap = &(iso9660->read_ce_req); + if (heap->cnt >= heap->allocated) { + int new_size; + + if (heap->allocated < 16) + new_size = 16; + else + new_size = heap->allocated * 2; + /* Overflow might keep us from growing the list. */ + if (new_size <= heap->allocated) + __archive_errx(1, "Out of memory"); + p = malloc(new_size * sizeof(p[0])); + if (p == NULL) + __archive_errx(1, "Out of memory"); + if (heap->reqs != NULL) { + memcpy(p, heap->reqs, heap->cnt * sizeof(*p)); + free(heap->reqs); + } + heap->reqs = p; + heap->allocated = new_size; + } + + /* + * Start with hole at end, walk it up tree to find insertion point. + */ + hole = heap->cnt++; + while (hole > 0) { + parent = (hole - 1)/2; + parent_offset = heap->reqs[parent].offset; + if (offset >= parent_offset) { + heap->reqs[hole].offset = offset; + heap->reqs[hole].file = file; + return (ARCHIVE_OK); + } + // Move parent into hole <==> move hole up tree. 
+ heap->reqs[hole] = heap->reqs[parent]; + hole = parent; + } + heap->reqs[0].offset = offset; + heap->reqs[0].file = file; + return (ARCHIVE_OK); +} + +static void +next_CE(struct read_ce_queue *heap) +{ + uint64_t a_offset, b_offset, c_offset; + int a, b, c; + struct read_ce_req tmp; + + if (heap->cnt < 1) + return; + + /* + * Move the last item in the heap to the root of the tree + */ + heap->reqs[0] = heap->reqs[--(heap->cnt)]; + + /* + * Rebalance the heap. + */ + a = 0; // Starting element and its offset + a_offset = heap->reqs[a].offset; + for (;;) { + b = a + a + 1; // First child + if (b >= heap->cnt) + return; + b_offset = heap->reqs[b].offset; + c = b + 1; // Use second child if it is smaller. + if (c < heap->cnt) { + c_offset = heap->reqs[c].offset; + if (c_offset < b_offset) { + b = c; + b_offset = c_offset; + } + } + if (a_offset <= b_offset) + return; + tmp = heap->reqs[a]; + heap->reqs[a] = heap->reqs[b]; + heap->reqs[b] = tmp; + a = b; + } +} + + +static int +read_CE(struct archive_read *a, struct iso9660 *iso9660) +{ + struct read_ce_queue *heap; + const unsigned char *b, *p, *end; + struct file_info *file; + size_t step; + int r; + + /* Read the data that the RRIP "CE" extension points to. */ + heap = &(iso9660->read_ce_req); + step = iso9660->logical_block_size; + while (heap->cnt && + heap->reqs[0].offset == iso9660->current_position) { + b = __archive_read_ahead(a, step, NULL); + if (b == NULL) { + archive_set_error(&a->archive, + ARCHIVE_ERRNO_MISC, + "Failed to read full block when scanning " + "ISO9660 directory list"); + return (ARCHIVE_FATAL); + } + do { + file = heap->reqs[0].file; + p = b + file->ce_offset; + end = p + file->ce_size; + next_CE(heap); + r = parse_rockridge(a, file, p, end); + if (r != ARCHIVE_OK) + return (ARCHIVE_FATAL); + } while (heap->cnt && + heap->reqs[0].offset == iso9660->current_position); + /* NOTE: Do not move this consume's code to front of + * do-while loop.
Registration of nested CE extension + * might cause error because of current position. */ + __archive_read_consume(a, step); + iso9660->current_position += step; + } + return (ARCHIVE_OK); +} + +static void +parse_rockridge_NM1(struct file_info *file, + const unsigned char *data, int data_length) +{ + if (!file->name_continues) + archive_string_empty(&file->name); + file->name_continues = 0; + if (data_length < 1) + return; + /* + * NM version 1 extension comprises: + * 1 byte flag, value is one of: + * = 0: remainder is name + * = 1: remainder is name, next NM entry continues name + * = 2: "." + * = 4: ".." + * = 32: Implementation specific + * All other values are reserved. + */ + switch(data[0]) { + case 0: + if (data_length < 2) + return; + archive_strncat(&file->name, (const char *)data + 1, data_length - 1); + break; + case 1: + if (data_length < 2) + return; + archive_strncat(&file->name, (const char *)data + 1, data_length - 1); + file->name_continues = 1; + break; + case 2: + archive_strcat(&file->name, "."); + break; + case 4: + archive_strcat(&file->name, ".."); + break; + default: + return; + } + +} + +static void +parse_rockridge_TF1(struct file_info *file, const unsigned char *data, + int data_length) +{ + char flag; + /* + * TF extension comprises: + * one byte flag + * create time (optional) + * modify time (optional) + * access time (optional) + * attribute time (optional) + * Time format and presence of fields + * is controlled by flag bits. + */ + if (data_length < 1) + return; + flag = data[0]; + ++data; + --data_length; + if (flag & 0x80) { + /* Use 17-byte time format. */ + if ((flag & 1) && data_length >= 17) { + /* Create time. */ + file->birthtime_is_set = 1; + file->birthtime = isodate17(data); + data += 17; + data_length -= 17; + } + if ((flag & 2) && data_length >= 17) { + /* Modify time. */ + file->mtime = isodate17(data); + data += 17; + data_length -= 17; + } + if ((flag & 4) && data_length >= 17) { + /* Access time. 
*/ + file->atime = isodate17(data); + data += 17; + data_length -= 17; + } + if ((flag & 8) && data_length >= 17) { + /* Attribute change time. */ + file->ctime = isodate17(data); + } + } else { + /* Use 7-byte time format. */ + if ((flag & 1) && data_length >= 7) { + /* Create time. */ + file->birthtime_is_set = 1; + file->birthtime = isodate7(data); + data += 7; + data_length -= 7; + } + if ((flag & 2) && data_length >= 7) { + /* Modify time. */ + file->mtime = isodate7(data); + data += 7; + data_length -= 7; + } + if ((flag & 4) && data_length >= 7) { + /* Access time. */ + file->atime = isodate7(data); + data += 7; + data_length -= 7; + } + if ((flag & 8) && data_length >= 7) { + /* Attribute change time. */ + file->ctime = isodate7(data); + } + } +} + +static void +parse_rockridge_SL1(struct file_info *file, const unsigned char *data, + int data_length) +{ + const char *separator = ""; + + if (!file->symlink_continues || file->symlink.length < 1) + archive_string_empty(&file->symlink); + else if (!file->symlink_continues && + file->symlink.s[file->symlink.length - 1] != '/') + separator = "/"; + file->symlink_continues = 0; + + /* + * Defined flag values: + * 0: This is the last SL record for this symbolic link + * 1: this symbolic link field continues in next SL entry + * All other values are reserved. + */ + if (data_length < 1) + return; + switch(*data) { + case 0: + break; + case 1: + file->symlink_continues = 1; + break; + default: + return; + } + ++data; /* Skip flag byte. */ + --data_length; + + /* + * SL extension body stores "components". + * Basically, this is a complicated way of storing + * a POSIX path. It also interferes with using + * symlinks for storing non-path data. + * + * Each component is 2 bytes (flag and length) + * possibly followed by name data. 
+ */ + while (data_length >= 2) { + unsigned char flag = *data++; + unsigned char nlen = *data++; + data_length -= 2; + + archive_strcat(&file->symlink, separator); + separator = "/"; + + switch(flag) { + case 0: /* Usual case, this is text. */ + if (data_length < nlen) + return; + archive_strncat(&file->symlink, + (const char *)data, nlen); + break; + case 0x01: /* Text continues in next component. */ + if (data_length < nlen) + return; + archive_strncat(&file->symlink, + (const char *)data, nlen); + separator = ""; + break; + case 0x02: /* Current dir. */ + archive_strcat(&file->symlink, "."); + break; + case 0x04: /* Parent dir. */ + archive_strcat(&file->symlink, ".."); + break; + case 0x08: /* Root of filesystem. */ + archive_strcat(&file->symlink, "/"); + separator = ""; + break; + case 0x10: /* Undefined (historically "volume root" */ + archive_string_empty(&file->symlink); + archive_strcat(&file->symlink, "ROOT"); + break; + case 0x20: /* Undefined (historically "hostname") */ + archive_strcat(&file->symlink, "hostname"); + break; + default: + /* TODO: issue a warning ? 
*/ + return; + } + data += nlen; + data_length -= nlen; + } +} + +static void +parse_rockridge_ZF1(struct file_info *file, const unsigned char *data, + int data_length) +{ + + if (data[0] == 0x70 && data[1] == 0x7a && data_length == 12) { + /* paged zlib */ + file->pz = 1; + file->pz_log2_bs = data[3]; + file->pz_uncompressed_size = archive_le32dec(&data[4]); + } +} + +static void +register_file(struct iso9660 *iso9660, struct file_info *file) +{ + + file->use_next = iso9660->use_files; + iso9660->use_files = file; +} + +static void +release_files(struct iso9660 *iso9660) +{ + struct content *con, *connext; + struct file_info *file; + + file = iso9660->use_files; + while (file != NULL) { + struct file_info *next = file->use_next; + + archive_string_free(&file->name); + archive_string_free(&file->symlink); + con = file->contents.first; + while (con != NULL) { + connext = con->next; + free(con); + con = connext; + } + free(file); + file = next; + } +} + +static int +next_entry_seek(struct archive_read *a, struct iso9660 *iso9660, + struct file_info **pfile) +{ + struct file_info *file; + + *pfile = file = next_cache_entry(iso9660); + if (file == NULL) + return (ARCHIVE_EOF); + + /* Don't waste time seeking for zero-length bodies. */ + if (file->size == 0) + file->offset = iso9660->current_position; + + /* Seek forward to the start of the entry. */ + if (iso9660->current_position < file->offset) { + int64_t step; + + step = file->offset - iso9660->current_position; + step = __archive_read_skip(a, step); + if (step < 0) + return ((int)step); + iso9660->current_position = file->offset; + } + + /* We found body of file; handle it now. 
+ */ + return (ARCHIVE_OK); +} + +static struct file_info * +next_cache_entry(struct iso9660 *iso9660) +{ + struct file_info *file; + struct { + struct file_info *first; + struct file_info **last; + } empty_files; + int64_t number; + int count; + + file = cache_get_entry(iso9660); + if (file != NULL) { + while (file->parent != NULL && !file->parent->exposed) { + /* If file's parent is not exposed, it's moved + * to next entry of its parent. */ + cache_add_to_next_of_parent(iso9660, file); + file = cache_get_entry(iso9660); + } + return (file); + } + + file = next_entry(iso9660); + if (file == NULL) + return (NULL); + + if ((file->mode & AE_IFMT) != AE_IFREG || file->number == -1) + return (file); + + count = 0; + number = file->number; + iso9660->cache_files.first = NULL; + iso9660->cache_files.last = &(iso9660->cache_files.first); + empty_files.first = NULL; + empty_files.last = &empty_files.first; + /* Collect files which have the same file serial number. + * Peek pending_files so that a file whose number is different + * is not put back. */ + while (iso9660->pending_files.used > 0 && + (iso9660->pending_files.files[0]->number == -1 || + iso9660->pending_files.files[0]->number == number)) { + if (file->number == -1) { + /* This file has the same offset + * but it's the wrong offset which empty files + * and symlink files have. + * NOTE: This wrong offset was recorded by + * old mkisofs utility. If the ISO image is + * created by latest mkisofs, this does not + * happen.
+ */ + file->next = NULL; + *empty_files.last = file; + empty_files.last = &(file->next); + } else { + count++; + cache_add_entry(iso9660, file); + } + file = next_entry(iso9660); + } + + if (count == 0) + return (file); + if (file->number == -1) { + file->next = NULL; + *empty_files.last = file; + empty_files.last = &(file->next); + } else { + count++; + cache_add_entry(iso9660, file); + } + + if (count > 1) { + /* The count is the same as number of hardlink, + * so much so that each nlinks of files in cache_file + * is overwritten by value of the count. + */ + for (file = iso9660->cache_files.first; + file != NULL; file = file->next) + file->nlinks = count; + } + /* If there are empty files, that files are added + * to the tail of the cache_files. */ + if (empty_files.first != NULL) { + *iso9660->cache_files.last = empty_files.first; + iso9660->cache_files.last = empty_files.last; + } + return (cache_get_entry(iso9660)); +} + +static inline void +cache_add_entry(struct iso9660 *iso9660, struct file_info *file) +{ + file->next = NULL; + *iso9660->cache_files.last = file; + iso9660->cache_files.last = &(file->next); +} + +static inline void +cache_add_to_next_of_parent(struct iso9660 *iso9660, struct file_info *file) +{ + file->next = file->parent->next; + file->parent->next = file; + if (iso9660->cache_files.last == &(file->parent->next)) + iso9660->cache_files.last = &(file->next); +} + +static inline struct file_info * +cache_get_entry(struct iso9660 *iso9660) +{ + struct file_info *file; + + if ((file = iso9660->cache_files.first) != NULL) { + iso9660->cache_files.first = file->next; + if (iso9660->cache_files.first == NULL) + iso9660->cache_files.last = &(iso9660->cache_files.first); + } + return (file); +} + +static void +heap_add_entry(struct heap_queue *heap, struct file_info *file, uint64_t key) +{ + uint64_t file_key, parent_key; + int hole, parent; + + /* Expand our pending files list as necessary. 
*/ + if (heap->used >= heap->allocated) { + struct file_info **new_pending_files; + int new_size = heap->allocated * 2; + + if (heap->allocated < 1024) + new_size = 1024; + /* Overflow might keep us from growing the list. */ + if (new_size <= heap->allocated) + __archive_errx(1, "Out of memory"); + new_pending_files = (struct file_info **) + malloc(new_size * sizeof(new_pending_files[0])); + if (new_pending_files == NULL) + __archive_errx(1, "Out of memory"); + memcpy(new_pending_files, heap->files, + heap->allocated * sizeof(new_pending_files[0])); + if (heap->files != NULL) + free(heap->files); + heap->files = new_pending_files; + heap->allocated = new_size; + } + + file_key = file->key = key; + + /* + * Start with hole at end, walk it up tree to find insertion point. + */ + hole = heap->used++; + while (hole > 0) { + parent = (hole - 1)/2; + parent_key = heap->files[parent]->key; + if (file_key >= parent_key) { + heap->files[hole] = file; + return; + } + // Move parent into hole <==> move hole up tree. + heap->files[hole] = heap->files[parent]; + hole = parent; + } + heap->files[0] = file; +} + +static struct file_info * +heap_get_entry(struct heap_queue *heap) +{ + uint64_t a_key, b_key, c_key; + int a, b, c; + struct file_info *r, *tmp; + + if (heap->used < 1) + return (NULL); + + /* + * The first file in the list is the earliest; we'll return this. + */ + r = heap->files[0]; + + /* + * Move the last item in the heap to the root of the tree + */ + heap->files[0] = heap->files[--(heap->used)]; + + /* + * Rebalance the heap. + */ + a = 0; // Starting element and its heap key + a_key = heap->files[a]->key; + for (;;) { + b = a + a + 1; // First child + if (b >= heap->used) + return (r); + b_key = heap->files[b]->key; + c = b + 1; // Use second child if it is smaller. 
+ if (c < heap->used) { + c_key = heap->files[c]->key; + if (c_key < b_key) { + b = c; + b_key = c_key; + } + } + if (a_key <= b_key) + return (r); + tmp = heap->files[a]; + heap->files[a] = heap->files[b]; + heap->files[b] = tmp; + a = b; + } +} + +static unsigned int +toi(const void *p, int n) +{ + const unsigned char *v = (const unsigned char *)p; + if (n > 1) + return v[0] + 256 * toi(v + 1, n - 1); + if (n == 1) + return v[0]; + return (0); +} + +static time_t +isodate7(const unsigned char *v) +{ + struct tm tm; + int offset; + memset(&tm, 0, sizeof(tm)); + tm.tm_year = v[0]; + tm.tm_mon = v[1] - 1; + tm.tm_mday = v[2]; + tm.tm_hour = v[3]; + tm.tm_min = v[4]; + tm.tm_sec = v[5]; + /* v[6] is the signed timezone offset, in 1/4-hour increments. */ + offset = ((const signed char *)v)[6]; + if (offset > -48 && offset < 52) { + tm.tm_hour -= offset / 4; + tm.tm_min -= (offset % 4) * 15; + } + return (time_from_tm(&tm)); +} + +static time_t +isodate17(const unsigned char *v) +{ + struct tm tm; + int offset; + memset(&tm, 0, sizeof(tm)); + tm.tm_year = (v[0] - '0') * 1000 + (v[1] - '0') * 100 + + (v[2] - '0') * 10 + (v[3] - '0') + - 1900; + tm.tm_mon = (v[4] - '0') * 10 + (v[5] - '0'); + tm.tm_mday = (v[6] - '0') * 10 + (v[7] - '0'); + tm.tm_hour = (v[8] - '0') * 10 + (v[9] - '0'); + tm.tm_min = (v[10] - '0') * 10 + (v[11] - '0'); + tm.tm_sec = (v[12] - '0') * 10 + (v[13] - '0'); + /* v[16] is the signed timezone offset, in 1/4-hour increments. */ + offset = ((const signed char *)v)[16]; + if (offset > -48 && offset < 52) { + tm.tm_hour -= offset / 4; + tm.tm_min -= (offset % 4) * 15; + } + return (time_from_tm(&tm)); +} + +static time_t +time_from_tm(struct tm *t) +{ +#if HAVE_TIMEGM + /* Use platform timegm() if available. */ + return (timegm(t)); +#else + /* Else use direct calculation using POSIX assumptions. */ + /* First, fix up tm_yday based on the year/month/day. */ + mktime(t); + /* Then we can compute timegm() from first principles. 
*/ + return (t->tm_sec + t->tm_min * 60 + t->tm_hour * 3600 + + t->tm_yday * 86400 + (t->tm_year - 70) * 31536000 + + ((t->tm_year - 69) / 4) * 86400 - + ((t->tm_year - 1) / 100) * 86400 + + ((t->tm_year + 299) / 400) * 86400); +#endif +} + +static const char * +build_pathname(struct archive_string *as, struct file_info *file) +{ + if (file->parent != NULL && archive_strlen(&file->parent->name) > 0) { + build_pathname(as, file->parent); + archive_strcat(as, "/"); + } + if (archive_strlen(&file->name) == 0) + archive_strcat(as, "."); + else + archive_string_concat(as, &file->name); + return (as->s); +} + +#if DEBUG +static void +dump_isodirrec(FILE *out, const unsigned char *isodirrec) +{ + fprintf(out, " l %d,", + toi(isodirrec + DR_length_offset, DR_length_size)); + fprintf(out, " a %d,", + toi(isodirrec + DR_ext_attr_length_offset, DR_ext_attr_length_size)); + fprintf(out, " ext 0x%x,", + toi(isodirrec + DR_extent_offset, DR_extent_size)); + fprintf(out, " s %d,", + toi(isodirrec + DR_size_offset, DR_extent_size)); + fprintf(out, " f 0x%02x,", + toi(isodirrec + DR_flags_offset, DR_flags_size)); + fprintf(out, " u %d,", + toi(isodirrec + DR_file_unit_size_offset, DR_file_unit_size_size)); + fprintf(out, " ilv %d,", + toi(isodirrec + DR_interleave_offset, DR_interleave_size)); + fprintf(out, " seq %d,", + toi(isodirrec + DR_volume_sequence_number_offset, DR_volume_sequence_number_size)); + fprintf(out, " nl %d:", + toi(isodirrec + DR_name_len_offset, DR_name_len_size)); + fprintf(out, " `%.*s'", + toi(isodirrec + DR_name_len_offset, DR_name_len_size), isodirrec + DR_name_offset); +} +#endif diff --git a/lib/libarchive/archive_read_support_format_mtree.c b/lib/libarchive/archive_read_support_format_mtree.c new file mode 100644 index 000000000..6c35298e2 --- /dev/null +++ b/lib/libarchive/archive_read_support_format_mtree.c @@ -0,0 +1,1423 @@ +/*- + * Copyright (c) 2003-2007 Tim Kientzle + * Copyright (c) 2008 Joerg Sonnenberger + * All rights reserved. 
+ * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR(S) ``AS IS'' AND ANY EXPRESS OR + * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES + * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. + * IN NO EVENT SHALL THE AUTHOR(S) BE LIABLE FOR ANY DIRECT, INDIRECT, + * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT + * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF + * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 
+ */ + +#include "archive_platform.h" +__FBSDID("$FreeBSD: head/lib/libarchive/archive_read_support_format_mtree.c 201165 2009-12-29 05:52:13Z kientzle $"); + +#ifdef HAVE_SYS_STAT_H +#include <sys/stat.h> +#endif +#ifdef HAVE_ERRNO_H +#include <errno.h> +#endif +#ifdef HAVE_FCNTL_H +#include <fcntl.h> +#endif +#include <stddef.h> +/* #include <stdint.h> */ /* See archive_platform.h */ +#ifdef HAVE_STDLIB_H +#include <stdlib.h> +#endif +#ifdef HAVE_STRING_H +#include <string.h> +#endif + +#include "archive.h" +#include "archive_entry.h" +#include "archive_private.h" +#include "archive_read_private.h" +#include "archive_string.h" + +#ifndef O_BINARY +#define O_BINARY 0 +#endif + +#define MTREE_HAS_DEVICE 0x0001 +#define MTREE_HAS_FFLAGS 0x0002 +#define MTREE_HAS_GID 0x0004 +#define MTREE_HAS_GNAME 0x0008 +#define MTREE_HAS_MTIME 0x0010 +#define MTREE_HAS_NLINK 0x0020 +#define MTREE_HAS_PERM 0x0040 +#define MTREE_HAS_SIZE 0x0080 +#define MTREE_HAS_TYPE 0x0100 +#define MTREE_HAS_UID 0x0200 +#define MTREE_HAS_UNAME 0x0400 + +#define MTREE_HAS_OPTIONAL 0x0800 + +struct mtree_option { + struct mtree_option *next; + char *value; +}; + +struct mtree_entry { + struct mtree_entry *next; + struct mtree_option *options; + char *name; + char full; + char used; +}; + +struct mtree { + struct archive_string line; + size_t buffsize; + char *buff; + off_t offset; + int fd; + int filetype; + int archive_format; + const char *archive_format_name; + struct mtree_entry *entries; + struct mtree_entry *this_entry; + struct archive_string current_dir; + struct archive_string contents_name; + + struct archive_entry_linkresolver *resolver; + + off_t cur_size, cur_offset; +}; + +static int cleanup(struct archive_read *); +static int mtree_bid(struct archive_read *); +static int parse_file(struct archive_read *, struct archive_entry *, + struct mtree *, struct mtree_entry *, int *); +static void parse_escapes(char *, struct mtree_entry *); +static int parse_line(struct archive_read *, struct archive_entry *, + struct mtree *, struct mtree_entry *, int *); +static int
parse_keyword(struct archive_read *, struct mtree *, + struct archive_entry *, struct mtree_option *, int *); +static int read_data(struct archive_read *a, + const void **buff, size_t *size, off_t *offset); +static ssize_t readline(struct archive_read *, struct mtree *, char **, ssize_t); +static int skip(struct archive_read *a); +static int read_header(struct archive_read *, + struct archive_entry *); +#ifndef __minix +static int64_t mtree_atol10(char **); +static int64_t mtree_atol8(char **); +static int64_t mtree_atol(char **); +#else +static int32_t mtree_atol10(char **); +static int32_t mtree_atol8(char **); +static int32_t mtree_atol(char **); +#endif + +static void +free_options(struct mtree_option *head) +{ + struct mtree_option *next; + + for (; head != NULL; head = next) { + next = head->next; + free(head->value); + free(head); + } +} + +int +archive_read_support_format_mtree(struct archive *_a) +{ + struct archive_read *a = (struct archive_read *)_a; + struct mtree *mtree; + int r; + + mtree = (struct mtree *)malloc(sizeof(*mtree)); + if (mtree == NULL) { + archive_set_error(&a->archive, ENOMEM, + "Can't allocate mtree data"); + return (ARCHIVE_FATAL); + } + memset(mtree, 0, sizeof(*mtree)); + mtree->fd = -1; + + r = __archive_read_register_format(a, mtree, "mtree", + mtree_bid, NULL, read_header, read_data, skip, cleanup); + + if (r != ARCHIVE_OK) + free(mtree); + return (ARCHIVE_OK); +} + +static int +cleanup(struct archive_read *a) +{ + struct mtree *mtree; + struct mtree_entry *p, *q; + + mtree = (struct mtree *)(a->format->data); + + p = mtree->entries; + while (p != NULL) { + q = p->next; + free(p->name); + free_options(p->options); + free(p); + p = q; + } + archive_string_free(&mtree->line); + archive_string_free(&mtree->current_dir); + archive_string_free(&mtree->contents_name); + archive_entry_linkresolver_free(mtree->resolver); + + free(mtree->buff); + free(mtree); + (a->format->data) = NULL; + return (ARCHIVE_OK); +} + + +static int 
+mtree_bid(struct archive_read *a) +{ + const char *signature = "#mtree"; + const char *p; + + /* Now let's look at the actual header and see if it matches. */ + p = __archive_read_ahead(a, strlen(signature), NULL); + if (p == NULL) + return (-1); + + if (strncmp(p, signature, strlen(signature)) == 0) + return (8 * (int)strlen(signature)); + return (0); +} + +/* + * The extended mtree format permits multiple lines specifying + * attributes for each file. For those entries, only the last line + * is actually used. Practically speaking, that means we have + * to read the entire mtree file into memory up front. + * + * The parsing is done in two steps. First, it is decided if a line + * changes the global defaults and if it is, processed accordingly. + * Otherwise, the options of the line are merged with the current + * global options. + */ +static int +add_option(struct archive_read *a, struct mtree_option **global, + const char *value, size_t len) +{ + struct mtree_option *option; + + if ((option = malloc(sizeof(*option))) == NULL) { + archive_set_error(&a->archive, errno, "Can't allocate memory"); + return (ARCHIVE_FATAL); + } + if ((option->value = malloc(len + 1)) == NULL) { + free(option); + archive_set_error(&a->archive, errno, "Can't allocate memory"); + return (ARCHIVE_FATAL); + } + memcpy(option->value, value, len); + option->value[len] = '\0'; + option->next = *global; + *global = option; + return (ARCHIVE_OK); +} + +static void +remove_option(struct mtree_option **global, const char *value, size_t len) +{ + struct mtree_option *iter, *last; + + last = NULL; + for (iter = *global; iter != NULL; last = iter, iter = iter->next) { + if (strncmp(iter->value, value, len) == 0 && + (iter->value[len] == '\0' || + iter->value[len] == '=')) + break; + } + if (iter == NULL) + return; + if (last == NULL) + *global = iter->next; + else + last->next = iter->next; + + free(iter->value); + free(iter); +} + +static int +process_global_set(struct archive_read *a, + struct 
mtree_option **global, const char *line) +{ + const char *next, *eq; + size_t len; + int r; + + line += 4; + for (;;) { + next = line + strspn(line, " \t\r\n"); + if (*next == '\0') + return (ARCHIVE_OK); + line = next; + next = line + strcspn(line, " \t\r\n"); + eq = strchr(line, '='); + if (eq > next) + len = next - line; + else + len = eq - line; + + remove_option(global, line, len); + r = add_option(a, global, line, next - line); + if (r != ARCHIVE_OK) + return (r); + line = next; + } +} + +static int +process_global_unset(struct archive_read *a, + struct mtree_option **global, const char *line) +{ + const char *next; + size_t len; + + line += 6; + if (strchr(line, '=') != NULL) { + archive_set_error(&a->archive, ARCHIVE_ERRNO_MISC, + "/unset shall not contain `='"); + return ARCHIVE_FATAL; + } + + for (;;) { + next = line + strspn(line, " \t\r\n"); + if (*next == '\0') + return (ARCHIVE_OK); + line = next; + len = strcspn(line, " \t\r\n"); + + if (len == 3 && strncmp(line, "all", 3) == 0) { + free_options(*global); + *global = NULL; + } else { + remove_option(global, line, len); + } + + line += len; + } +} + +static int +process_add_entry(struct archive_read *a, struct mtree *mtree, + struct mtree_option **global, const char *line, + struct mtree_entry **last_entry) +{ + struct mtree_entry *entry; + struct mtree_option *iter; + const char *next, *eq; + size_t len; + int r; + + if ((entry = malloc(sizeof(*entry))) == NULL) { + archive_set_error(&a->archive, errno, "Can't allocate memory"); + return (ARCHIVE_FATAL); + } + entry->next = NULL; + entry->options = NULL; + entry->name = NULL; + entry->used = 0; + entry->full = 0; + + /* Add this entry to list. 
*/ + if (*last_entry == NULL) + mtree->entries = entry; + else + (*last_entry)->next = entry; + *last_entry = entry; + + len = strcspn(line, " \t\r\n"); + if ((entry->name = malloc(len + 1)) == NULL) { + archive_set_error(&a->archive, errno, "Can't allocate memory"); + return (ARCHIVE_FATAL); + } + + memcpy(entry->name, line, len); + entry->name[len] = '\0'; + parse_escapes(entry->name, entry); + + line += len; + for (iter = *global; iter != NULL; iter = iter->next) { + r = add_option(a, &entry->options, iter->value, + strlen(iter->value)); + if (r != ARCHIVE_OK) + return (r); + } + + for (;;) { + next = line + strspn(line, " \t\r\n"); + if (*next == '\0') + return (ARCHIVE_OK); + line = next; + next = line + strcspn(line, " \t\r\n"); + eq = strchr(line, '='); + if (eq > next) + len = next - line; + else + len = eq - line; + + remove_option(&entry->options, line, len); + r = add_option(a, &entry->options, line, next - line); + if (r != ARCHIVE_OK) + return (r); + line = next; + } +} + +static int +read_mtree(struct archive_read *a, struct mtree *mtree) +{ + ssize_t len; + uintmax_t counter; + char *p; + struct mtree_option *global; + struct mtree_entry *last_entry; + int r; + + mtree->archive_format = ARCHIVE_FORMAT_MTREE; + mtree->archive_format_name = "mtree"; + + global = NULL; + last_entry = NULL; + + for (counter = 1; ; ++counter) { + len = readline(a, mtree, &p, 256); + if (len == 0) { + mtree->this_entry = mtree->entries; + free_options(global); + return (ARCHIVE_OK); + } + if (len < 0) { + free_options(global); + return (len); + } + /* Leading whitespace is never significant, ignore it. */ + while (*p == ' ' || *p == '\t') { + ++p; + --len; + } + /* Skip content lines and blank lines. 
*/ + if (*p == '#') + continue; + if (*p == '\r' || *p == '\n' || *p == '\0') + continue; + if (*p != '/') { + r = process_add_entry(a, mtree, &global, p, + &last_entry); + } else if (strncmp(p, "/set", 4) == 0) { + if (p[4] != ' ' && p[4] != '\t') + break; + r = process_global_set(a, &global, p); + } else if (strncmp(p, "/unset", 6) == 0) { + if (p[6] != ' ' && p[6] != '\t') + break; + r = process_global_unset(a, &global, p); + } else + break; + + if (r != ARCHIVE_OK) { + free_options(global); + return r; + } + } + + archive_set_error(&a->archive, ARCHIVE_ERRNO_FILE_FORMAT, + "Can't parse line %ju", counter); + free_options(global); + return (ARCHIVE_FATAL); +} + +/* + * Read in the entire mtree file into memory on the first request. + * Then use the next unused file to satisfy each header request. + */ +static int +read_header(struct archive_read *a, struct archive_entry *entry) +{ + struct mtree *mtree; + char *p; + int r, use_next; + + mtree = (struct mtree *)(a->format->data); + + if (mtree->fd >= 0) { + close(mtree->fd); + mtree->fd = -1; + } + + if (mtree->entries == NULL) { + mtree->resolver = archive_entry_linkresolver_new(); + if (mtree->resolver == NULL) + return ARCHIVE_FATAL; + archive_entry_linkresolver_set_strategy(mtree->resolver, + ARCHIVE_FORMAT_MTREE); + r = read_mtree(a, mtree); + if (r != ARCHIVE_OK) + return (r); + } + + a->archive.archive_format = mtree->archive_format; + a->archive.archive_format_name = mtree->archive_format_name; + + for (;;) { + if (mtree->this_entry == NULL) + return (ARCHIVE_EOF); + if (strcmp(mtree->this_entry->name, "..") == 0) { + mtree->this_entry->used = 1; + if (archive_strlen(&mtree->current_dir) > 0) { + /* Roll back current path. 
*/ + p = mtree->current_dir.s + + mtree->current_dir.length - 1; + while (p >= mtree->current_dir.s && *p != '/') + --p; + if (p >= mtree->current_dir.s) + --p; + mtree->current_dir.length + = p - mtree->current_dir.s + 1; + } + } + if (!mtree->this_entry->used) { + use_next = 0; + r = parse_file(a, entry, mtree, mtree->this_entry, &use_next); + if (use_next == 0) + return (r); + } + mtree->this_entry = mtree->this_entry->next; + } +} + +/* + * A single file can have multiple lines contribute specifications. + * Parse as many lines as necessary, then pull additional information + * from a backing file on disk as necessary. + */ +static int +parse_file(struct archive_read *a, struct archive_entry *entry, + struct mtree *mtree, struct mtree_entry *mentry, int *use_next) +{ + const char *path; + struct stat st_storage, *st; + struct mtree_entry *mp; + struct archive_entry *sparse_entry; + int r = ARCHIVE_OK, r1, parsed_kws, mismatched_type; + + mentry->used = 1; + + /* Initialize reasonable defaults. */ + mtree->filetype = AE_IFREG; + archive_entry_set_size(entry, 0); + + /* Parse options from this line. */ + parsed_kws = 0; + r = parse_line(a, entry, mtree, mentry, &parsed_kws); + + if (mentry->full) { + archive_entry_copy_pathname(entry, mentry->name); + /* + * "Full" entries are allowed to have multiple lines + * and those lines aren't required to be adjacent. We + * don't support multiple lines for "relative" entries + * nor do we make any attempt to merge data from + * separate "relative" and "full" entries. (Merging + * "relative" and "full" entries would require dealing + * with pathname canonicalization, which is a very + * tricky subject.) + */ + for (mp = mentry->next; mp != NULL; mp = mp->next) { + if (mp->full && !mp->used + && strcmp(mentry->name, mp->name) == 0) { + /* Later lines override earlier ones. 
*/ + mp->used = 1; + r1 = parse_line(a, entry, mtree, mp, + &parsed_kws); + if (r1 < r) + r = r1; + } + } + } else { + /* + * Relative entries require us to construct + * the full path and possibly update the + * current directory. + */ + size_t n = archive_strlen(&mtree->current_dir); + if (n > 0) + archive_strcat(&mtree->current_dir, "/"); + archive_strcat(&mtree->current_dir, mentry->name); + archive_entry_copy_pathname(entry, mtree->current_dir.s); + if (archive_entry_filetype(entry) != AE_IFDIR) + mtree->current_dir.length = n; + } + + /* + * Try to open and stat the file to get the real size + * and other file info. It would be nice to avoid + * this here so that getting a listing of an mtree + * wouldn't require opening every referenced contents + * file. But then we wouldn't know the actual + * contents size, so I don't see a really viable way + * around this. (Also, we may want to someday pull + * other unspecified info from the contents file on + * disk.) + */ + mtree->fd = -1; + if (archive_strlen(&mtree->contents_name) > 0) + path = mtree->contents_name.s; + else + path = archive_entry_pathname(entry); + + if (archive_entry_filetype(entry) == AE_IFREG || + archive_entry_filetype(entry) == AE_IFDIR) { + mtree->fd = open(path, O_RDONLY | O_BINARY); + if (mtree->fd == -1 && + (errno != ENOENT || + archive_strlen(&mtree->contents_name) > 0)) { + archive_set_error(&a->archive, errno, + "Can't open %s", path); + r = ARCHIVE_WARN; + } + } + + st = &st_storage; + if (mtree->fd >= 0) { + if (fstat(mtree->fd, st) == -1) { + archive_set_error(&a->archive, errno, + "Could not fstat %s", path); + r = ARCHIVE_WARN; + /* If we can't stat it, don't keep it open. */ + close(mtree->fd); + mtree->fd = -1; + st = NULL; + } + } else if (lstat(path, st) == -1) { + st = NULL; + } + + /* + * If there is a contents file on disk, use that size; + * otherwise leave it as-is (it might have been set from + * the mtree size= keyword). 
+ */ + if (st != NULL) { + mismatched_type = 0; + if ((st->st_mode & S_IFMT) == S_IFREG && + archive_entry_filetype(entry) != AE_IFREG) + mismatched_type = 1; + if ((st->st_mode & S_IFMT) == S_IFLNK && + archive_entry_filetype(entry) != AE_IFLNK) + mismatched_type = 1; + if ((st->st_mode & S_IFSOCK) == S_IFSOCK && + archive_entry_filetype(entry) != AE_IFSOCK) + mismatched_type = 1; + if ((st->st_mode & S_IFMT) == S_IFCHR && + archive_entry_filetype(entry) != AE_IFCHR) + mismatched_type = 1; + if ((st->st_mode & S_IFMT) == S_IFBLK && + archive_entry_filetype(entry) != AE_IFBLK) + mismatched_type = 1; + if ((st->st_mode & S_IFMT) == S_IFDIR && + archive_entry_filetype(entry) != AE_IFDIR) + mismatched_type = 1; + if ((st->st_mode & S_IFMT) == S_IFIFO && + archive_entry_filetype(entry) != AE_IFIFO) + mismatched_type = 1; + + if (mismatched_type) { + if ((parsed_kws & MTREE_HAS_OPTIONAL) == 0) { + archive_set_error(&a->archive, + ARCHIVE_ERRNO_MISC, + "mtree specification has different type for %s", + archive_entry_pathname(entry)); + r = ARCHIVE_WARN; + } else { + *use_next = 1; + } + /* Don't hold a non-regular file open. 
*/ + if (mtree->fd >= 0) + close(mtree->fd); + mtree->fd = -1; + st = NULL; + return r; + } + } + + if (st != NULL) { + if ((parsed_kws & MTREE_HAS_DEVICE) == 0 && + (archive_entry_filetype(entry) == AE_IFCHR || + archive_entry_filetype(entry) == AE_IFBLK)) + archive_entry_set_rdev(entry, st->st_rdev); + if ((parsed_kws & (MTREE_HAS_GID | MTREE_HAS_GNAME)) == 0) + archive_entry_set_gid(entry, st->st_gid); + if ((parsed_kws & (MTREE_HAS_UID | MTREE_HAS_UNAME)) == 0) + archive_entry_set_uid(entry, st->st_uid); + if ((parsed_kws & MTREE_HAS_MTIME) == 0) { +#if HAVE_STRUCT_STAT_ST_MTIMESPEC_TV_NSEC + archive_entry_set_mtime(entry, st->st_mtime, + st->st_mtimespec.tv_nsec); +#elif HAVE_STRUCT_STAT_ST_MTIM_TV_NSEC + archive_entry_set_mtime(entry, st->st_mtime, + st->st_mtim.tv_nsec); +#elif HAVE_STRUCT_STAT_ST_MTIME_N + archive_entry_set_mtime(entry, st->st_mtime, + st->st_mtime_n); +#elif HAVE_STRUCT_STAT_ST_UMTIME + archive_entry_set_mtime(entry, st->st_mtime, + st->st_umtime*1000); +#elif HAVE_STRUCT_STAT_ST_MTIME_USEC + archive_entry_set_mtime(entry, st->st_mtime, + st->st_mtime_usec*1000); +#else + archive_entry_set_mtime(entry, st->st_mtime, 0); +#endif + } + if ((parsed_kws & MTREE_HAS_NLINK) == 0) + archive_entry_set_nlink(entry, st->st_nlink); + if ((parsed_kws & MTREE_HAS_PERM) == 0) + archive_entry_set_perm(entry, st->st_mode); + if ((parsed_kws & MTREE_HAS_SIZE) == 0) + archive_entry_set_size(entry, st->st_size); + archive_entry_set_ino(entry, st->st_ino); + archive_entry_set_dev(entry, st->st_dev); + + archive_entry_linkify(mtree->resolver, &entry, &sparse_entry); + } else if (parsed_kws & MTREE_HAS_OPTIONAL) { + /* + * Couldn't open the entry, stat it or the on-disk type + * didn't match. If this entry is optional, just ignore it + * and read the next header entry. + */ + *use_next = 1; + return ARCHIVE_OK; + } + + mtree->cur_size = archive_entry_size(entry); + mtree->offset = 0; + + return r; +} + +/* + * Each line contains a sequence of keywords. 
+ */ +static int +parse_line(struct archive_read *a, struct archive_entry *entry, + struct mtree *mtree, struct mtree_entry *mp, int *parsed_kws) +{ + struct mtree_option *iter; + int r = ARCHIVE_OK, r1; + + for (iter = mp->options; iter != NULL; iter = iter->next) { + r1 = parse_keyword(a, mtree, entry, iter, parsed_kws); + if (r1 < r) + r = r1; + } + if ((*parsed_kws & MTREE_HAS_TYPE) == 0) { + archive_set_error(&a->archive, ARCHIVE_ERRNO_FILE_FORMAT, + "Missing type keyword in mtree specification"); + return (ARCHIVE_WARN); + } + return (r); +} + +/* + * Device entries have one of the following forms: + * raw dev_t + * format,major,minor[,subdevice] + * + * Just use major and minor, no translation etc is done + * between formats. + */ +static int +parse_device(struct archive *a, struct archive_entry *entry, char *val) +{ + char *comma1, *comma2; + + comma1 = strchr(val, ','); + if (comma1 == NULL) { + archive_entry_set_dev(entry, mtree_atol10(&val)); + return (ARCHIVE_OK); + } + ++comma1; + comma2 = strchr(comma1, ','); + if (comma2 == NULL) { + archive_set_error(a, ARCHIVE_ERRNO_FILE_FORMAT, + "Malformed device attribute"); + return (ARCHIVE_WARN); + } + ++comma2; + archive_entry_set_rdevmajor(entry, mtree_atol(&comma1)); + archive_entry_set_rdevminor(entry, mtree_atol(&comma2)); + return (ARCHIVE_OK); +} + +/* + * Parse a single keyword and its value. + */ +static int +parse_keyword(struct archive_read *a, struct mtree *mtree, + struct archive_entry *entry, struct mtree_option *option, int *parsed_kws) +{ + char *val, *key; + + key = option->value; + + if (*key == '\0') + return (ARCHIVE_OK); + + if (strcmp(key, "optional") == 0) { + *parsed_kws |= MTREE_HAS_OPTIONAL; + return (ARCHIVE_OK); + } + if (strcmp(key, "ignore") == 0) { + /* + * The mtree processing is not recursive, so + * recursion will only happen for explicitly listed + * entries. 
+ */ + return (ARCHIVE_OK); + } + + val = strchr(key, '='); + if (val == NULL) { + archive_set_error(&a->archive, ARCHIVE_ERRNO_FILE_FORMAT, + "Malformed attribute \"%s\" (%d)", key, key[0]); + return (ARCHIVE_WARN); + } + + *val = '\0'; + ++val; + + switch (key[0]) { + case 'c': + if (strcmp(key, "content") == 0 + || strcmp(key, "contents") == 0) { + parse_escapes(val, NULL); + archive_strcpy(&mtree->contents_name, val); + break; + } + if (strcmp(key, "cksum") == 0) + break; + case 'd': + if (strcmp(key, "device") == 0) { + *parsed_kws |= MTREE_HAS_DEVICE; + return parse_device(&a->archive, entry, val); + } + case 'f': + if (strcmp(key, "flags") == 0) { + *parsed_kws |= MTREE_HAS_FFLAGS; + archive_entry_copy_fflags_text(entry, val); + break; + } + case 'g': + if (strcmp(key, "gid") == 0) { + *parsed_kws |= MTREE_HAS_GID; + archive_entry_set_gid(entry, mtree_atol10(&val)); + break; + } + if (strcmp(key, "gname") == 0) { + *parsed_kws |= MTREE_HAS_GNAME; + archive_entry_copy_gname(entry, val); + break; + } + case 'l': + if (strcmp(key, "link") == 0) { + archive_entry_copy_symlink(entry, val); + break; + } + case 'm': + if (strcmp(key, "md5") == 0 || strcmp(key, "md5digest") == 0) + break; + if (strcmp(key, "mode") == 0) { + if (val[0] >= '0' && val[0] <= '9') { + *parsed_kws |= MTREE_HAS_PERM; + archive_entry_set_perm(entry, + mtree_atol8(&val)); + } else { + archive_set_error(&a->archive, + ARCHIVE_ERRNO_FILE_FORMAT, + "Symbolic mode \"%s\" unsupported", val); + return ARCHIVE_WARN; + } + break; + } + case 'n': + if (strcmp(key, "nlink") == 0) { + *parsed_kws |= MTREE_HAS_NLINK; + archive_entry_set_nlink(entry, mtree_atol10(&val)); + break; + } + case 'r': + if (strcmp(key, "rmd160") == 0 || + strcmp(key, "rmd160digest") == 0) + break; + case 's': + if (strcmp(key, "sha1") == 0 || strcmp(key, "sha1digest") == 0) + break; + if (strcmp(key, "sha256") == 0 || + strcmp(key, "sha256digest") == 0) + break; + if (strcmp(key, "sha384") == 0 || + strcmp(key, "sha384digest") 
== 0) + break; + if (strcmp(key, "sha512") == 0 || + strcmp(key, "sha512digest") == 0) + break; + if (strcmp(key, "size") == 0) { + archive_entry_set_size(entry, mtree_atol10(&val)); + break; + } + case 't': + if (strcmp(key, "tags") == 0) { + /* + * Comma delimited list of tags. + * Ignore the tags for now, but the interface + * should be extended to allow inclusion/exclusion. + */ + break; + } + if (strcmp(key, "time") == 0) { + time_t m; + long ns; + + *parsed_kws |= MTREE_HAS_MTIME; + m = (time_t)mtree_atol10(&val); + if (*val == '.') { + ++val; + ns = (long)mtree_atol10(&val); + } else + ns = 0; + archive_entry_set_mtime(entry, m, ns); + break; + } + if (strcmp(key, "type") == 0) { + *parsed_kws |= MTREE_HAS_TYPE; + switch (val[0]) { + case 'b': + if (strcmp(val, "block") == 0) { + mtree->filetype = AE_IFBLK; + break; + } + case 'c': + if (strcmp(val, "char") == 0) { + mtree->filetype = AE_IFCHR; + break; + } + case 'd': + if (strcmp(val, "dir") == 0) { + mtree->filetype = AE_IFDIR; + break; + } + case 'f': + if (strcmp(val, "fifo") == 0) { + mtree->filetype = AE_IFIFO; + break; + } + if (strcmp(val, "file") == 0) { + mtree->filetype = AE_IFREG; + break; + } + case 'l': + if (strcmp(val, "link") == 0) { + mtree->filetype = AE_IFLNK; + break; + } + default: + archive_set_error(&a->archive, + ARCHIVE_ERRNO_FILE_FORMAT, + "Unrecognized file type \"%s\"", val); + return (ARCHIVE_WARN); + } + archive_entry_set_filetype(entry, mtree->filetype); + break; + } + case 'u': + if (strcmp(key, "uid") == 0) { + *parsed_kws |= MTREE_HAS_UID; + archive_entry_set_uid(entry, mtree_atol10(&val)); + break; + } + if (strcmp(key, "uname") == 0) { + *parsed_kws |= MTREE_HAS_UNAME; + archive_entry_copy_uname(entry, val); + break; + } + default: + archive_set_error(&a->archive, ARCHIVE_ERRNO_FILE_FORMAT, + "Unrecognized key %s=%s", key, val); + return (ARCHIVE_WARN); + } + return (ARCHIVE_OK); +} + +static int +read_data(struct archive_read *a, const void **buff, size_t *size, off_t 
*offset) +{ + size_t bytes_to_read; + ssize_t bytes_read; + struct mtree *mtree; + + mtree = (struct mtree *)(a->format->data); + if (mtree->fd < 0) { + *buff = NULL; + *offset = 0; + *size = 0; + return (ARCHIVE_EOF); + } + if (mtree->buff == NULL) { + mtree->buffsize = 64 * 1024; + mtree->buff = malloc(mtree->buffsize); + if (mtree->buff == NULL) { + archive_set_error(&a->archive, ENOMEM, + "Can't allocate memory"); + return (ARCHIVE_FATAL); + } + } + + *buff = mtree->buff; + *offset = mtree->offset; + if ((off_t)mtree->buffsize > mtree->cur_size - mtree->offset) + bytes_to_read = mtree->cur_size - mtree->offset; + else + bytes_to_read = mtree->buffsize; + bytes_read = read(mtree->fd, mtree->buff, bytes_to_read); + if (bytes_read < 0) { + archive_set_error(&a->archive, errno, "Can't read"); + return (ARCHIVE_WARN); + } + if (bytes_read == 0) { + *size = 0; + return (ARCHIVE_EOF); + } + mtree->offset += bytes_read; + *size = bytes_read; + return (ARCHIVE_OK); +} + +/* Skip does nothing except possibly close the contents file. */ +static int +skip(struct archive_read *a) +{ + struct mtree *mtree; + + mtree = (struct mtree *)(a->format->data); + if (mtree->fd >= 0) { + close(mtree->fd); + mtree->fd = -1; + } + return (ARCHIVE_OK); +} + +/* + * Since parsing backslash sequences always makes strings shorter, + * we can always do this conversion in-place. 
+ */ +static void +parse_escapes(char *src, struct mtree_entry *mentry) +{ + char *dest = src; + char c; + + if (mentry != NULL && strcmp(src, ".") == 0) + mentry->full = 1; + + while (*src != '\0') { + c = *src++; + if (c == '/' && mentry != NULL) + mentry->full = 1; + if (c == '\\') { + switch (src[0]) { + case '0': + if (src[1] < '0' || src[1] > '7') { + c = 0; + ++src; + break; + } + /* FALLTHROUGH */ + case '1': + case '2': + case '3': + if (src[1] >= '0' && src[1] <= '7' && + src[2] >= '0' && src[2] <= '7') { + c = (src[0] - '0') << 6; + c |= (src[1] - '0') << 3; + c |= (src[2] - '0'); + src += 3; + } + break; + case 'a': + c = '\a'; + ++src; + break; + case 'b': + c = '\b'; + ++src; + break; + case 'f': + c = '\f'; + ++src; + break; + case 'n': + c = '\n'; + ++src; + break; + case 'r': + c = '\r'; + ++src; + break; + case 's': + c = ' '; + ++src; + break; + case 't': + c = '\t'; + ++src; + break; + case 'v': + c = '\v'; + ++src; + break; + } + } + *dest++ = c; + } + *dest = '\0'; +} + +/* + * Note that this implementation does not (and should not!) obey + * locale settings; you cannot simply substitute strtol here, since + * it does obey locale. + */ +#ifndef __minix +static int64_t +mtree_atol8(char **p) +{ + int64_t l, limit, last_digit_limit; + int digit, base; + + base = 8; + limit = INT64_MAX / base; + last_digit_limit = INT64_MAX % base; + + l = 0; + digit = **p - '0'; + while (digit >= 0 && digit < base) { + if (l>limit || (l == limit && digit > last_digit_limit)) { + l = INT64_MAX; /* Truncate on overflow. */ + break; + } + l = (l * base) + digit; + digit = *++(*p) - '0'; + } + return (l); +} +#else +static int32_t +mtree_atol8(char **p) +{ + int32_t l, limit, last_digit_limit; + int digit, base; + + base = 8; + limit = INT32_MAX / base; + last_digit_limit = INT32_MAX % base; + + l = 0; + digit = **p - '0'; + while (digit >= 0 && digit < base) { + if (l>limit || (l == limit && digit > last_digit_limit)) { + l = INT32_MAX; /* Truncate on overflow. 
*/ + break; + } + l = (l * base) + digit; + digit = *++(*p) - '0'; + } + return (l); +} +#endif + +/* + * Note that this implementation does not (and should not!) obey + * locale settings; you cannot simply substitute strtol here, since + * it does obey locale. + */ +#ifndef __minix +static int64_t +mtree_atol10(char **p) +{ + int64_t l, limit, last_digit_limit; + int base, digit, sign; + + base = 10; + limit = INT64_MAX / base; + last_digit_limit = INT64_MAX % base; + + if (**p == '-') { + sign = -1; + ++(*p); + } else + sign = 1; + + l = 0; + digit = **p - '0'; + while (digit >= 0 && digit < base) { + if (l > limit || (l == limit && digit > last_digit_limit)) { + l = INT64_MAX; /* Truncate on overflow. */ + break; + } + l = (l * base) + digit; + digit = *++(*p) - '0'; + } + return (sign < 0) ? -l : l; +} +#else +static int32_t +mtree_atol10(char **p) +{ + int32_t l, limit, last_digit_limit; + int base, digit, sign; + + base = 10; + limit = INT32_MAX / base; + last_digit_limit = INT32_MAX % base; + + if (**p == '-') { + sign = -1; + ++(*p); + } else + sign = 1; + + l = 0; + digit = **p - '0'; + while (digit >= 0 && digit < base) { + if (l > limit || (l == limit && digit > last_digit_limit)) { + l = INT32_MAX; /* Truncate on overflow. */ + break; + } + l = (l * base) + digit; + digit = *++(*p) - '0'; + } + return (sign < 0) ? -l : l; +} +#endif +/* + * Note that this implementation does not (and should not!) obey + * locale settings; you cannot simply substitute strtol here, since + * it does obey locale. 
+ */ +#ifndef __minix +static int64_t +mtree_atol16(char **p) +{ + int64_t l, limit, last_digit_limit; + int base, digit, sign; + + base = 16; + limit = INT64_MAX / base; + last_digit_limit = INT64_MAX % base; + + if (**p == '-') { + sign = -1; + ++(*p); + } else + sign = 1; + + l = 0; + if (**p >= '0' && **p <= '9') + digit = **p - '0'; + else if (**p >= 'a' && **p <= 'f') + digit = **p - 'a' + 10; + else if (**p >= 'A' && **p <= 'F') + digit = **p - 'A' + 10; + else + digit = -1; + while (digit >= 0 && digit < base) { + if (l > limit || (l == limit && digit > last_digit_limit)) { + l = INT64_MAX; /* Truncate on overflow. */ + break; + } + l = (l * base) + digit; + if (*++(*p) >= '0' && **p <= '9') /* Advance to the next digit. */ + digit = **p - '0'; + else if (**p >= 'a' && **p <= 'f') + digit = **p - 'a' + 10; + else if (**p >= 'A' && **p <= 'F') + digit = **p - 'A' + 10; + else + digit = -1; + } + return (sign < 0) ? -l : l; +} +#else +static int32_t +mtree_atol16(char **p) +{ + int32_t l, limit, last_digit_limit; + int base, digit, sign; + + base = 16; + limit = INT32_MAX / base; + last_digit_limit = INT32_MAX % base; + + if (**p == '-') { + sign = -1; + ++(*p); + } else + sign = 1; + + l = 0; + if (**p >= '0' && **p <= '9') + digit = **p - '0'; + else if (**p >= 'a' && **p <= 'f') + digit = **p - 'a' + 10; + else if (**p >= 'A' && **p <= 'F') + digit = **p - 'A' + 10; + else + digit = -1; + while (digit >= 0 && digit < base) { + if (l > limit || (l == limit && digit > last_digit_limit)) { + l = INT32_MAX; /* Truncate on overflow. */ + break; + } + l = (l * base) + digit; + if (*++(*p) >= '0' && **p <= '9') /* Advance to the next digit. */ + digit = **p - '0'; + else if (**p >= 'a' && **p <= 'f') + digit = **p - 'a' + 10; + else if (**p >= 'A' && **p <= 'F') + digit = **p - 'A' + 10; + else + digit = -1; + } + return (sign < 0) ?
-l : l; +} +#endif + +#ifndef __minix +static int64_t +mtree_atol(char **p) +{ + if (**p != '0') + return mtree_atol10(p); + if ((*p)[1] == 'x' || (*p)[1] == 'X') { + *p += 2; + return mtree_atol16(p); + } + return mtree_atol8(p); +} +#else +static int32_t +mtree_atol(char **p) +{ + if (**p != '0') + return mtree_atol10(p); + if ((*p)[1] == 'x' || (*p)[1] == 'X') { + *p += 2; + return mtree_atol16(p); + } + return mtree_atol8(p); +} +#endif +/* + * Returns length of line (including trailing newline) + * or negative on error. 'start' argument is updated to + * point to first character of line. + */ +static ssize_t +readline(struct archive_read *a, struct mtree *mtree, char **start, ssize_t limit) +{ + ssize_t bytes_read; + ssize_t total_size = 0; + ssize_t find_off = 0; + const void *t; + const char *s; + void *p; + char *u; + + /* Accumulate line in a line buffer. */ + for (;;) { + /* Read some more. */ + t = __archive_read_ahead(a, 1, &bytes_read); + if (t == NULL) + return (0); + if (bytes_read < 0) + return (ARCHIVE_FATAL); + s = t; /* Start of line? */ + p = memchr(t, '\n', bytes_read); + /* If we found '\n', trim the read. */ + if (p != NULL) { + bytes_read = 1 + ((const char *)p) - s; + } + if (total_size + bytes_read + 1 > limit) { + archive_set_error(&a->archive, + ARCHIVE_ERRNO_FILE_FORMAT, + "Line too long"); + return (ARCHIVE_FATAL); + } + if (archive_string_ensure(&mtree->line, + total_size + bytes_read + 1) == NULL) { + archive_set_error(&a->archive, ENOMEM, + "Can't allocate working buffer"); + return (ARCHIVE_FATAL); + } + memcpy(mtree->line.s + total_size, t, bytes_read); + __archive_read_consume(a, bytes_read); + total_size += bytes_read; + /* Null terminate. */ + mtree->line.s[total_size] = '\0'; + /* If we found an unescaped '\n', clean up and return. 
*/ + for (u = mtree->line.s + find_off; *u; ++u) { + if (u[0] == '\n') { + *start = mtree->line.s; + return total_size; + } + if (u[0] == '#') { + if (p == NULL) + break; + *start = mtree->line.s; + return total_size; + } + if (u[0] != '\\') + continue; + if (u[1] == '\\') { + ++u; + continue; + } + if (u[1] == '\n') { + memmove(u, u + 1, + total_size - (u - mtree->line.s) + 1); + --total_size; + ++u; + break; + } + if (u[1] == '\0') + break; + } + find_off = u - mtree->line.s; + } +} diff --git a/lib/libarchive/archive_read_support_format_raw.c b/lib/libarchive/archive_read_support_format_raw.c new file mode 100644 index 000000000..818b64c9e --- /dev/null +++ b/lib/libarchive/archive_read_support_format_raw.c @@ -0,0 +1,193 @@ +/*- + * Copyright (c) 2003-2009 Tim Kientzle + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR(S) ``AS IS'' AND ANY EXPRESS OR + * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES + * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. 
+ * IN NO EVENT SHALL THE AUTHOR(S) BE LIABLE FOR ANY DIRECT, INDIRECT, + * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT + * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF + * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + */ +#include "archive_platform.h" +__FBSDID("$FreeBSD: head/lib/libarchive/archive_read_support_format_raw.c 201107 2009-12-28 03:25:33Z kientzle $"); + +#ifdef HAVE_ERRNO_H +#include <errno.h> +#endif +#include <stdio.h> +#ifdef HAVE_STDLIB_H +#include <stdlib.h> +#endif + +#include "archive.h" +#include "archive_entry.h" +#include "archive_private.h" +#include "archive_read_private.h" + +struct raw_info { +#ifndef __minix + int64_t offset; /* Current position in the file. */ +#else + off_t offset; +#endif + int end_of_file; +}; + +static int archive_read_format_raw_bid(struct archive_read *); +static int archive_read_format_raw_cleanup(struct archive_read *); +static int archive_read_format_raw_read_data(struct archive_read *, + const void **, size_t *, off_t *); +static int archive_read_format_raw_read_data_skip(struct archive_read *); +static int archive_read_format_raw_read_header(struct archive_read *, + struct archive_entry *); + +int +archive_read_support_format_raw(struct archive *_a) +{ + struct raw_info *info; + struct archive_read *a = (struct archive_read *)_a; + int r; + + info = (struct raw_info *)calloc(1, sizeof(*info)); + if (info == NULL) { + archive_set_error(&a->archive, ENOMEM, + "Can't allocate raw_info data"); + return (ARCHIVE_FATAL); + } + + r = __archive_read_register_format(a, + info, + "raw", + archive_read_format_raw_bid, + NULL, + archive_read_format_raw_read_header, + archive_read_format_raw_read_data, + archive_read_format_raw_read_data_skip, +
archive_read_format_raw_cleanup); + if (r != ARCHIVE_OK) + free(info); + return (r); +} + +/* + * Bid 1 if this is a non-empty file. Anyone who can really support + * this should outbid us, so it should generally be safe to use "raw" + * in conjunction with other formats. But, this could really confuse + * folks if there are bid errors or minor file damage, so we don't + * include "raw" as part of support_format_all(). + */ +static int +archive_read_format_raw_bid(struct archive_read *a) +{ + + if (__archive_read_ahead(a, 1, NULL) == NULL) + return (-1); + return (1); +} + +/* + * Mock up a fake header. + */ +static int +archive_read_format_raw_read_header(struct archive_read *a, + struct archive_entry *entry) +{ + struct raw_info *info; + + info = (struct raw_info *)(a->format->data); + if (info->end_of_file) + return (ARCHIVE_EOF); + + a->archive.archive_format = ARCHIVE_FORMAT_RAW; + a->archive.archive_format_name = "Raw data"; + archive_entry_set_pathname(entry, "data"); + /* XXX should we set mode to mimic a regular file? XXX */ + /* I'm deliberately leaving most fields unset here. */ + return (ARCHIVE_OK); +} + +static int +archive_read_format_raw_read_data(struct archive_read *a, + const void **buff, size_t *size, off_t *offset) +{ + struct raw_info *info; + ssize_t avail; + + info = (struct raw_info *)(a->format->data); + if (info->end_of_file) + return (ARCHIVE_EOF); + + /* Get whatever bytes are immediately available. */ + *buff = __archive_read_ahead(a, 1, &avail); + if (avail > 0) { + /* Consume and return the bytes we just read */ + __archive_read_consume(a, avail); + *size = avail; + *offset = info->offset; + info->offset += *size; + return (ARCHIVE_OK); + } else if (0 == avail) { + /* Record and return end-of-file. */ + info->end_of_file = 1; + *size = 0; + *offset = info->offset; + return (ARCHIVE_EOF); + } else { + /* Record and return an error. 
*/ + *size = 0; + *offset = info->offset; + return (avail); + } +} + +static int +archive_read_format_raw_read_data_skip(struct archive_read *a) +{ + struct raw_info *info; + off_t bytes_skipped; +#ifndef __minix + int64_t request = 1024 * 1024 * 1024UL; /* Skip 1 GB at a time. */ +#else + int32_t request = 1024 * 1024 * 1024UL; /* Skip 1 GB at a time. */ +#endif + + info = (struct raw_info *)(a->format->data); + if (info->end_of_file) + return (ARCHIVE_EOF); + info->end_of_file = 1; + + for (;;) { + bytes_skipped = __archive_read_skip_lenient(a, request); + if (bytes_skipped < 0) + return (ARCHIVE_FATAL); + if (bytes_skipped < request) + return (ARCHIVE_OK); + /* We skipped all the bytes we asked for. There might + * be more, so try again. */ + } +} + +static int +archive_read_format_raw_cleanup(struct archive_read *a) +{ + struct raw_info *info; + + info = (struct raw_info *)(a->format->data); + free(info); + a->format->data = NULL; + return (ARCHIVE_OK); +} diff --git a/lib/libarchive/archive_read_support_format_tar.c b/lib/libarchive/archive_read_support_format_tar.c new file mode 100644 index 000000000..e34b4e736 --- /dev/null +++ b/lib/libarchive/archive_read_support_format_tar.c @@ -0,0 +1,2651 @@ +/*- + * Copyright (c) 2003-2007 Tim Kientzle + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. 
+ * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR(S) ``AS IS'' AND ANY EXPRESS OR + * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES + * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. + * IN NO EVENT SHALL THE AUTHOR(S) BE LIABLE FOR ANY DIRECT, INDIRECT, + * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT + * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF + * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + */ + +#include "archive_platform.h" +__FBSDID("$FreeBSD: head/lib/libarchive/archive_read_support_format_tar.c 201161 2009-12-29 05:44:39Z kientzle $"); + +#ifdef HAVE_ERRNO_H +#include <errno.h> +#endif +#include <stddef.h> +/* #include <stdint.h> */ /* See archive_platform.h */ +#ifdef HAVE_STDLIB_H +#include <stdlib.h> +#endif +#ifdef HAVE_STRING_H +#include <string.h> +#endif + +/* Obtain suitable wide-character manipulation functions. */ +#ifdef HAVE_WCHAR_H +#include <wchar.h> +#else +/* Good enough for equality testing, which is all we need. */ +static int wcscmp(const wchar_t *s1, const wchar_t *s2) +{ + int diff = *s1 - *s2; + while (*s1 && diff == 0) + diff = (int)*++s1 - (int)*++s2; + return diff; +} +/* Good enough for equality testing, which is all we need. */ +static int wcsncmp(const wchar_t *s1, const wchar_t *s2, size_t n) +{ + int diff = *s1 - *s2; + while (*s1 && diff == 0 && n-- > 0) + diff = (int)*++s1 - (int)*++s2; + return diff; +} +static size_t wcslen(const wchar_t *s) +{ + const wchar_t *p = s; + while (*p) + p++; + return p - s; +} +#endif + +#include "archive.h" +#include "archive_entry.h" +#include "archive_private.h" +#include "archive_read_private.h" + +#define tar_min(a,b) ((a) < (b) ? (a) : (b)) + +/* + * Layout of POSIX 'ustar' tar header.
+ */ +struct archive_entry_header_ustar { + char name[100]; + char mode[8]; + char uid[8]; + char gid[8]; + char size[12]; + char mtime[12]; + char checksum[8]; + char typeflag[1]; + char linkname[100]; /* "old format" header ends here */ + char magic[6]; /* For POSIX: "ustar\0" */ + char version[2]; /* For POSIX: "00" */ + char uname[32]; + char gname[32]; + char rdevmajor[8]; + char rdevminor[8]; + char prefix[155]; +}; + +/* + * Structure of GNU tar header + */ +struct gnu_sparse { + char offset[12]; + char numbytes[12]; +}; + +struct archive_entry_header_gnutar { + char name[100]; + char mode[8]; + char uid[8]; + char gid[8]; + char size[12]; + char mtime[12]; + char checksum[8]; + char typeflag[1]; + char linkname[100]; + char magic[8]; /* "ustar \0" (note blank/blank/null at end) */ + char uname[32]; + char gname[32]; + char rdevmajor[8]; + char rdevminor[8]; + char atime[12]; + char ctime[12]; + char offset[12]; + char longnames[4]; + char unused[1]; + struct gnu_sparse sparse[4]; + char isextended[1]; + char realsize[12]; + /* + * Old GNU format doesn't use POSIX 'prefix' field; they use + * the 'L' (longname) entry instead. + */ +}; + +/* + * Data specific to this format. + */ +struct sparse_block { + struct sparse_block *next; + off_t offset; + off_t remaining; +}; + +struct tar { + struct archive_string acl_text; + struct archive_string entry_pathname; + /* For "GNU.sparse.name" and other similar path extensions. 
*/ + struct archive_string entry_pathname_override; + struct archive_string entry_linkpath; + struct archive_string entry_uname; + struct archive_string entry_gname; + struct archive_string longlink; + struct archive_string longname; + struct archive_string pax_header; + struct archive_string pax_global; + struct archive_string line; + int pax_hdrcharset_binary; + wchar_t *pax_entry; + size_t pax_entry_length; + int header_recursion_depth; +#ifndef __minix + int64_t entry_bytes_remaining; + int64_t entry_offset; + int64_t entry_padding; + int64_t realsize; +#else + size_t entry_bytes_remaining; + off_t entry_offset; + off_t entry_padding; + size_t realsize; +#endif + struct sparse_block *sparse_list; + struct sparse_block *sparse_last; +#ifndef __minix + int64_t sparse_offset; + int64_t sparse_numbytes; +#else + off_t sparse_offset; + size_t sparse_numbytes; +#endif + int sparse_gnu_major; + int sparse_gnu_minor; + char sparse_gnu_pending; +}; + +static ssize_t UTF8_mbrtowc(wchar_t *pwc, const char *s, size_t n); +static int archive_block_is_null(const unsigned char *p); +static char *base64_decode(const char *, size_t, size_t *); +static void gnu_add_sparse_entry(struct tar *, + off_t offset, off_t remaining); +static void gnu_clear_sparse_list(struct tar *); +static int gnu_sparse_old_read(struct archive_read *, struct tar *, + const struct archive_entry_header_gnutar *header); +static void gnu_sparse_old_parse(struct tar *, + const struct gnu_sparse *sparse, int length); +static int gnu_sparse_01_parse(struct tar *, const char *); +static ssize_t gnu_sparse_10_read(struct archive_read *, struct tar *); +static int header_Solaris_ACL(struct archive_read *, struct tar *, + struct archive_entry *, const void *); +static int header_common(struct archive_read *, struct tar *, + struct archive_entry *, const void *); +static int header_old_tar(struct archive_read *, struct tar *, + struct archive_entry *, const void *); +static int header_pax_extensions(struct 
archive_read *, struct tar *, + struct archive_entry *, const void *); +static int header_pax_global(struct archive_read *, struct tar *, + struct archive_entry *, const void *h); +static int header_longlink(struct archive_read *, struct tar *, + struct archive_entry *, const void *h); +static int header_longname(struct archive_read *, struct tar *, + struct archive_entry *, const void *h); +static int header_volume(struct archive_read *, struct tar *, + struct archive_entry *, const void *h); +static int header_ustar(struct archive_read *, struct tar *, + struct archive_entry *, const void *h); +static int header_gnutar(struct archive_read *, struct tar *, + struct archive_entry *, const void *h); +static int archive_read_format_tar_bid(struct archive_read *); +static int archive_read_format_tar_cleanup(struct archive_read *); +static int archive_read_format_tar_read_data(struct archive_read *a, + const void **buff, size_t *size, off_t *offset); +static int archive_read_format_tar_skip(struct archive_read *a); +static int archive_read_format_tar_read_header(struct archive_read *, + struct archive_entry *); +static int checksum(struct archive_read *, const void *); +static int pax_attribute(struct tar *, struct archive_entry *, + char *key, char *value); +static int pax_header(struct archive_read *, struct tar *, + struct archive_entry *, char *attr); +#ifndef __minix +static void pax_time(const char *, int64_t *sec, long *nanos); +#else +static void pax_time(const char *, time_t *sec, long *nanos); +#endif +static ssize_t readline(struct archive_read *, struct tar *, const char **, + ssize_t limit); +static int read_body_to_string(struct archive_read *, struct tar *, + struct archive_string *, const void *h); +#ifndef __minix +static int64_t tar_atol(const char *, unsigned); +static int64_t tar_atol10(const char *, unsigned); +static int64_t tar_atol256(const char *, unsigned); +static int64_t tar_atol8(const char *, unsigned); +#else +static int32_t 
tar_atol(const char *, unsigned); +static int32_t tar_atol10(const char *, unsigned); +static int32_t tar_atol256(const char *, unsigned); +static int32_t tar_atol8(const char *, unsigned); +#endif +static int tar_read_header(struct archive_read *, struct tar *, + struct archive_entry *); +static int tohex(int c); +static char *url_decode(const char *); +static wchar_t *utf8_decode(struct tar *, const char *, size_t length); + +int +archive_read_support_format_gnutar(struct archive *a) +{ + return (archive_read_support_format_tar(a)); +} + + +int +archive_read_support_format_tar(struct archive *_a) +{ + struct archive_read *a = (struct archive_read *)_a; + struct tar *tar; + int r; + + tar = (struct tar *)malloc(sizeof(*tar)); + if (tar == NULL) { + archive_set_error(&a->archive, ENOMEM, + "Can't allocate tar data"); + return (ARCHIVE_FATAL); + } + memset(tar, 0, sizeof(*tar)); + + r = __archive_read_register_format(a, tar, "tar", + archive_read_format_tar_bid, + NULL, + archive_read_format_tar_read_header, + archive_read_format_tar_read_data, + archive_read_format_tar_skip, + archive_read_format_tar_cleanup); + + if (r != ARCHIVE_OK) + free(tar); + return (ARCHIVE_OK); +} + +static int +archive_read_format_tar_cleanup(struct archive_read *a) +{ + struct tar *tar; + + tar = (struct tar *)(a->format->data); + gnu_clear_sparse_list(tar); + archive_string_free(&tar->acl_text); + archive_string_free(&tar->entry_pathname); + archive_string_free(&tar->entry_pathname_override); + archive_string_free(&tar->entry_linkpath); + archive_string_free(&tar->entry_uname); + archive_string_free(&tar->entry_gname); + archive_string_free(&tar->line); + archive_string_free(&tar->pax_global); + archive_string_free(&tar->pax_header); + archive_string_free(&tar->longname); + archive_string_free(&tar->longlink); + free(tar->pax_entry); + free(tar); + (a->format->data) = NULL; + return (ARCHIVE_OK); +} + + +static int +archive_read_format_tar_bid(struct archive_read *a) +{ + int bid; + 
const void *h; + const struct archive_entry_header_ustar *header; + + bid = 0; + + /* Now let's look at the actual header and see if it matches. */ + h = __archive_read_ahead(a, 512, NULL); + if (h == NULL) + return (-1); + + /* If it's an end-of-archive mark, we can handle it. */ + if ((*(const char *)h) == 0 + && archive_block_is_null((const unsigned char *)h)) { + /* + * Usually, I bid the number of bits verified, but + * in this case, 4096 seems excessive so I picked 10 as + * an arbitrary but reasonable-seeming value. + */ + return (10); + } + + /* If it's not an end-of-archive mark, it must have a valid checksum.*/ + if (!checksum(a, h)) + return (0); + bid += 48; /* Checksum is usually 6 octal digits. */ + + header = (const struct archive_entry_header_ustar *)h; + + /* Recognize POSIX formats. */ + if ((memcmp(header->magic, "ustar\0", 6) == 0) + &&(memcmp(header->version, "00", 2)==0)) + bid += 56; + + /* Recognize GNU tar format. */ + if ((memcmp(header->magic, "ustar ", 6) == 0) + &&(memcmp(header->version, " \0", 2)==0)) + bid += 56; + + /* Type flag must be null, digit or A-Z, a-z. */ + if (header->typeflag[0] != 0 && + !( header->typeflag[0] >= '0' && header->typeflag[0] <= '9') && + !( header->typeflag[0] >= 'A' && header->typeflag[0] <= 'Z') && + !( header->typeflag[0] >= 'a' && header->typeflag[0] <= 'z') ) + return (0); + bid += 2; /* 6 bits of variation in an 8-bit field leaves 2 bits. */ + + /* Sanity check: Look at first byte of mode field. */ + switch (255 & (unsigned)header->mode[0]) { + case 0: case 255: + /* Base-256 value: No further verification possible! */ + break; + case ' ': /* Not recommended, but not illegal, either. */ + break; + case '0': case '1': case '2': case '3': + case '4': case '5': case '6': case '7': + /* Octal Value. */ + /* TODO: Check format of remainder of this field. */ + break; + default: + /* Not a valid mode; bail out here. */ + return (0); + } + /* TODO: Sanity test uid/gid/size/mtime/rdevmajor/rdevminor fields. 
*/ + + return (bid); +} + +/* + * The function invoked by archive_read_header(). This + * just sets up a few things and then calls the internal + * tar_read_header() function below. + */ +static int +archive_read_format_tar_read_header(struct archive_read *a, + struct archive_entry *entry) +{ + /* + * When converting tar archives to cpio archives, it is + * essential that each distinct file have a distinct inode + * number. To simplify this, we keep a static count here to + * assign fake dev/inode numbers to each tar entry. Note that + * pax format archives may overwrite this with something more + * useful. + * + * Ideally, we would track every file read from the archive so + * that we could assign the same dev/ino pair to hardlinks, + * but the memory required to store a complete lookup table is + * probably not worthwhile just to support the relatively + * obscure tar->cpio conversion case. + */ + static int default_inode; + static int default_dev; + struct tar *tar; + struct sparse_block *sp; + const char *p; + int r; + size_t l; + + /* Assign default device/inode values. */ + archive_entry_set_dev(entry, 1 + default_dev); /* Don't use zero. */ + archive_entry_set_ino(entry, ++default_inode); /* Don't use zero. */ + /* Limit generated st_ino number to 16 bits. */ + if (default_inode >= 0xffff) { + ++default_dev; + default_inode = 0; + } + + tar = (struct tar *)(a->format->data); + tar->entry_offset = 0; + while (tar->sparse_list != NULL) { + sp = tar->sparse_list; + tar->sparse_list = sp->next; + free(sp); + } + tar->sparse_last = NULL; + tar->realsize = -1; /* Mark this as "unset" */ + + r = tar_read_header(a, tar, entry); + + /* + * "non-sparse" files are really just sparse files with + * a single block. 
+ */ + if (tar->sparse_list == NULL) + gnu_add_sparse_entry(tar, 0, tar->entry_bytes_remaining); + + if (r == ARCHIVE_OK) { + /* + * "Regular" entry with trailing '/' is really + * directory: This is needed for certain old tar + * variants and even for some broken newer ones. + */ + p = archive_entry_pathname(entry); + l = strlen(p); + if (archive_entry_filetype(entry) == AE_IFREG + && p[l-1] == '/') + archive_entry_set_filetype(entry, AE_IFDIR); + } + return (r); +} + +static int +archive_read_format_tar_read_data(struct archive_read *a, + const void **buff, size_t *size, off_t *offset) +{ + ssize_t bytes_read; + struct tar *tar; + struct sparse_block *p; + + tar = (struct tar *)(a->format->data); + + if (tar->sparse_gnu_pending) { + if (tar->sparse_gnu_major == 1 && tar->sparse_gnu_minor == 0) { + tar->sparse_gnu_pending = 0; + /* Read initial sparse map. */ + bytes_read = gnu_sparse_10_read(a, tar); + tar->entry_bytes_remaining -= bytes_read; + if (bytes_read < 0) + return (bytes_read); + } else { + *size = 0; + *offset = 0; + archive_set_error(&a->archive, ARCHIVE_ERRNO_MISC, + "Unrecognized GNU sparse file format"); + return (ARCHIVE_WARN); + } + tar->sparse_gnu_pending = 0; + } + + /* Remove exhausted entries from sparse list. */ + while (tar->sparse_list != NULL && + tar->sparse_list->remaining == 0) { + p = tar->sparse_list; + tar->sparse_list = p->next; + free(p); + } + + /* If we're at end of file, return EOF. 
*/ + if (tar->sparse_list == NULL || tar->entry_bytes_remaining == 0) { + if (__archive_read_skip(a, tar->entry_padding) < 0) + return (ARCHIVE_FATAL); + tar->entry_padding = 0; + *buff = NULL; + *size = 0; + *offset = tar->realsize; + return (ARCHIVE_EOF); + } + + *buff = __archive_read_ahead(a, 1, &bytes_read); + if (bytes_read < 0) + return (ARCHIVE_FATAL); + if (*buff == NULL) { + archive_set_error(&a->archive, ARCHIVE_ERRNO_MISC, + "Truncated tar archive"); + return (ARCHIVE_FATAL); + } + if (bytes_read > tar->entry_bytes_remaining) + bytes_read = tar->entry_bytes_remaining; + /* Don't read more than is available in the + * current sparse block. */ + if (tar->sparse_list->remaining < bytes_read) + bytes_read = tar->sparse_list->remaining; + *size = bytes_read; + *offset = tar->sparse_list->offset; + tar->sparse_list->remaining -= bytes_read; + tar->sparse_list->offset += bytes_read; + tar->entry_bytes_remaining -= bytes_read; + __archive_read_consume(a, bytes_read); + return (ARCHIVE_OK); +} + +static int +archive_read_format_tar_skip(struct archive_read *a) +{ +#ifndef __minix + int64_t bytes_skipped; +#else + size_t bytes_skipped; +#endif + struct tar* tar; + + tar = (struct tar *)(a->format->data); + + /* + * Compression layer skip functions are required to either skip the + * length requested or fail, so we can rely upon the entire entry + * plus padding being skipped. + */ + bytes_skipped = __archive_read_skip(a, + tar->entry_bytes_remaining + tar->entry_padding); + if (bytes_skipped < 0) + return (ARCHIVE_FATAL); + + tar->entry_bytes_remaining = 0; + tar->entry_padding = 0; + + /* Free the sparse list. */ + gnu_clear_sparse_list(tar); + + return (ARCHIVE_OK); +} + +/* + * This function recursively interprets all of the headers associated + * with a single entry. 
+ */ +static int +tar_read_header(struct archive_read *a, struct tar *tar, + struct archive_entry *entry) +{ + ssize_t bytes; + int err; + const void *h; + const struct archive_entry_header_ustar *header; + + /* Read 512-byte header record */ + h = __archive_read_ahead(a, 512, &bytes); + if (bytes < 0) + return (bytes); + if (bytes < 512) { /* Short read or EOF. */ + /* Try requesting just one byte and see what happens. */ + (void)__archive_read_ahead(a, 1, &bytes); + if (bytes == 0) { + /* + * The archive ends at a 512-byte boundary but + * without a proper end-of-archive marker. + * Yes, there are tar writers that do this; + * hold our nose and accept it. + */ + return (ARCHIVE_EOF); + } + /* Archive ends with a partial block; this is bad. */ + archive_set_error(&a->archive, ARCHIVE_ERRNO_FILE_FORMAT, + "Truncated tar archive"); + return (ARCHIVE_FATAL); + } + __archive_read_consume(a, 512); + + + /* Check for end-of-archive mark. */ + if (((*(const char *)h)==0) && archive_block_is_null((const unsigned char *)h)) { + /* Try to consume a second all-null record, as well. */ + h = __archive_read_ahead(a, 512, NULL); + if (h != NULL) + __archive_read_consume(a, 512); + archive_set_error(&a->archive, 0, NULL); + if (a->archive.archive_format_name == NULL) { + a->archive.archive_format = ARCHIVE_FORMAT_TAR; + a->archive.archive_format_name = "tar"; + } + return (ARCHIVE_EOF); + } + + /* + * Note: If the checksum fails and we return ARCHIVE_RETRY, + * then the client is likely to just retry. This is a very + * crude way to search for the next valid header! + * + * TODO: Improve this by implementing a real header scan. + */ + if (!checksum(a, h)) { + archive_set_error(&a->archive, EINVAL, "Damaged tar archive"); + return (ARCHIVE_RETRY); /* Retryable: Invalid header */ + } + + if (++tar->header_recursion_depth > 32) { + archive_set_error(&a->archive, EINVAL, "Too many special headers"); + return (ARCHIVE_WARN); + } + + /* Determine the format variant. 
*/ + header = (const struct archive_entry_header_ustar *)h; + switch(header->typeflag[0]) { + case 'A': /* Solaris tar ACL */ + a->archive.archive_format = ARCHIVE_FORMAT_TAR_PAX_INTERCHANGE; + a->archive.archive_format_name = "Solaris tar"; + err = header_Solaris_ACL(a, tar, entry, h); + break; + case 'g': /* POSIX-standard 'g' header. */ + a->archive.archive_format = ARCHIVE_FORMAT_TAR_PAX_INTERCHANGE; + a->archive.archive_format_name = "POSIX pax interchange format"; + err = header_pax_global(a, tar, entry, h); + break; + case 'K': /* Long link name (GNU tar, others) */ + err = header_longlink(a, tar, entry, h); + break; + case 'L': /* Long filename (GNU tar, others) */ + err = header_longname(a, tar, entry, h); + break; + case 'V': /* GNU volume header */ + err = header_volume(a, tar, entry, h); + break; + case 'X': /* Used by SUN tar; same as 'x'. */ + a->archive.archive_format = ARCHIVE_FORMAT_TAR_PAX_INTERCHANGE; + a->archive.archive_format_name = + "POSIX pax interchange format (Sun variant)"; + err = header_pax_extensions(a, tar, entry, h); + break; + case 'x': /* POSIX-standard 'x' header. 
*/ + a->archive.archive_format = ARCHIVE_FORMAT_TAR_PAX_INTERCHANGE; + a->archive.archive_format_name = "POSIX pax interchange format"; + err = header_pax_extensions(a, tar, entry, h); + break; + default: + if (memcmp(header->magic, "ustar \0", 8) == 0) { + a->archive.archive_format = ARCHIVE_FORMAT_TAR_GNUTAR; + a->archive.archive_format_name = "GNU tar format"; + err = header_gnutar(a, tar, entry, h); + } else if (memcmp(header->magic, "ustar", 5) == 0) { + if (a->archive.archive_format != ARCHIVE_FORMAT_TAR_PAX_INTERCHANGE) { + a->archive.archive_format = ARCHIVE_FORMAT_TAR_USTAR; + a->archive.archive_format_name = "POSIX ustar format"; + } + err = header_ustar(a, tar, entry, h); + } else { + a->archive.archive_format = ARCHIVE_FORMAT_TAR; + a->archive.archive_format_name = "tar (non-POSIX)"; + err = header_old_tar(a, tar, entry, h); + } + } + --tar->header_recursion_depth; + /* We return warnings or success as-is. Anything else is fatal. */ + if (err == ARCHIVE_WARN || err == ARCHIVE_OK) + return (err); + if (err == ARCHIVE_EOF) + /* EOF when recursively reading a header is bad. */ + archive_set_error(&a->archive, EINVAL, "Damaged tar archive"); + return (ARCHIVE_FATAL); +} + +/* + * Return true if block checksum is correct. + */ +static int +checksum(struct archive_read *a, const void *h) +{ + const unsigned char *bytes; + const struct archive_entry_header_ustar *header; + int check, i, sum; + + (void)a; /* UNUSED */ + bytes = (const unsigned char *)h; + header = (const struct archive_entry_header_ustar *)h; + + /* + * Test the checksum. Note that POSIX specifies _unsigned_ + * bytes for this calculation. 
+ */ + sum = tar_atol(header->checksum, sizeof(header->checksum)); + check = 0; + for (i = 0; i < 148; i++) + check += (unsigned char)bytes[i]; + for (; i < 156; i++) + check += 32; + for (; i < 512; i++) + check += (unsigned char)bytes[i]; + if (sum == check) + return (1); + + /* + * Repeat test with _signed_ bytes, just in case this archive + * was created by an old BSD, Solaris, or HP-UX tar with a + * broken checksum calculation. + */ + check = 0; + for (i = 0; i < 148; i++) + check += (signed char)bytes[i]; + for (; i < 156; i++) + check += 32; + for (; i < 512; i++) + check += (signed char)bytes[i]; + if (sum == check) + return (1); + + return (0); +} + +/* + * Return true if this block contains only nulls. + */ +static int +archive_block_is_null(const unsigned char *p) +{ + unsigned i; + + for (i = 0; i < 512; i++) + if (*p++) + return (0); + return (1); +} + +/* + * Interpret 'A' Solaris ACL header + */ +static int +header_Solaris_ACL(struct archive_read *a, struct tar *tar, + struct archive_entry *entry, const void *h) +{ + const struct archive_entry_header_ustar *header; + size_t size; + int err; +#ifndef __minix + int64_t type; +#else + int32_t type; +#endif + char *acl, *p; + wchar_t *wp; + + /* + * read_body_to_string adds a NUL terminator, but we need a little + * more to make sure that we don't overrun acl_text later. + */ + header = (const struct archive_entry_header_ustar *)h; + size = tar_atol(header->size, sizeof(header->size)); + err = read_body_to_string(a, tar, &(tar->acl_text), h); + if (err != ARCHIVE_OK) + return (err); + /* Recursively read next header */ + err = tar_read_header(a, tar, entry); + if ((err != ARCHIVE_OK) && (err != ARCHIVE_WARN)) + return (err); + + /* TODO: Examine the first characters to see if this + * is an AIX ACL descriptor. We'll likely never support + * them, but it would be polite to recognize and warn when + * we do see them. */ + + /* Leading octal number indicates ACL type and number of entries. 
*/ + p = acl = tar->acl_text.s; + type = 0; + while (*p != '\0' && p < acl + size) { + if (*p < '0' || *p > '7') { + archive_set_error(&a->archive, ARCHIVE_ERRNO_MISC, + "Malformed Solaris ACL attribute (invalid digit)"); + return(ARCHIVE_WARN); + } + type <<= 3; + type += *p - '0'; + if (type > 077777777) { + archive_set_error(&a->archive, ARCHIVE_ERRNO_MISC, + "Malformed Solaris ACL attribute (count too large)"); + return (ARCHIVE_WARN); + } + p++; + } + switch ((int)type & ~0777777) { + case 01000000: + /* POSIX.1e ACL */ + break; + case 03000000: + archive_set_error(&a->archive, ARCHIVE_ERRNO_MISC, + "Solaris NFSv4 ACLs not supported"); + return (ARCHIVE_WARN); + default: + archive_set_error(&a->archive, ARCHIVE_ERRNO_MISC, + "Malformed Solaris ACL attribute (unsupported type %o)", + (int)type); + return (ARCHIVE_WARN); + } + p++; + + if (p >= acl + size) { + archive_set_error(&a->archive, ARCHIVE_ERRNO_MISC, + "Malformed Solaris ACL attribute (body overflow)"); + return(ARCHIVE_WARN); + } + + /* ACL text is null-terminated; find the end. */ + size -= (p - acl); + acl = p; + + while (*p != '\0' && p < acl + size) + p++; + + wp = utf8_decode(tar, acl, p - acl); + err = __archive_entry_acl_parse_w(entry, wp, + ARCHIVE_ENTRY_ACL_TYPE_ACCESS); + if (err != ARCHIVE_OK) + archive_set_error(&a->archive, ARCHIVE_ERRNO_MISC, + "Malformed Solaris ACL attribute (unparsable)"); + return (err); +} + +/* + * Interpret 'K' long linkname header. + */ +static int +header_longlink(struct archive_read *a, struct tar *tar, + struct archive_entry *entry, const void *h) +{ + int err; + + err = read_body_to_string(a, tar, &(tar->longlink), h); + if (err != ARCHIVE_OK) + return (err); + err = tar_read_header(a, tar, entry); + if ((err != ARCHIVE_OK) && (err != ARCHIVE_WARN)) + return (err); + /* Set symlink if symlink already set, else hardlink. */ + archive_entry_copy_link(entry, tar->longlink.s); + return (ARCHIVE_OK); +} + +/* + * Interpret 'L' long filename header. 
+ */ +static int +header_longname(struct archive_read *a, struct tar *tar, + struct archive_entry *entry, const void *h) +{ + int err; + + err = read_body_to_string(a, tar, &(tar->longname), h); + if (err != ARCHIVE_OK) + return (err); + /* Read and parse "real" header, then override name. */ + err = tar_read_header(a, tar, entry); + if ((err != ARCHIVE_OK) && (err != ARCHIVE_WARN)) + return (err); + archive_entry_copy_pathname(entry, tar->longname.s); + return (ARCHIVE_OK); +} + + +/* + * Interpret 'V' GNU tar volume header. + */ +static int +header_volume(struct archive_read *a, struct tar *tar, + struct archive_entry *entry, const void *h) +{ + (void)h; + + /* Just skip this and read the next header. */ + return (tar_read_header(a, tar, entry)); +} + +/* + * Read body of an archive entry into an archive_string object. + */ +static int +read_body_to_string(struct archive_read *a, struct tar *tar, + struct archive_string *as, const void *h) +{ + off_t size, padded_size; + const struct archive_entry_header_ustar *header; + const void *src; + + (void)tar; /* UNUSED */ + header = (const struct archive_entry_header_ustar *)h; + size = tar_atol(header->size, sizeof(header->size)); + if ((size > 1048576) || (size < 0)) { + archive_set_error(&a->archive, EINVAL, + "Special header too large"); + return (ARCHIVE_FATAL); + } + + /* Fail if we can't make our buffer big enough. */ + if (archive_string_ensure(as, size+1) == NULL) { + archive_set_error(&a->archive, ENOMEM, + "No memory"); + return (ARCHIVE_FATAL); + } + + /* Read the body into the string. */ + padded_size = (size + 511) & ~ 511; + src = __archive_read_ahead(a, padded_size, NULL); + if (src == NULL) + return (ARCHIVE_FATAL); + memcpy(as->s, src, size); + __archive_read_consume(a, padded_size); + as->s[size] = '\0'; + return (ARCHIVE_OK); +} + +/* + * Parse out common header elements. 
+ * + * This would be the same as header_old_tar, except that the + * filename is handled slightly differently for old and POSIX + * entries (POSIX entries support a 'prefix'). This factoring + * allows header_old_tar and header_ustar + * to handle filenames differently, while still putting most of the + * common parsing into one place. + */ +static int +header_common(struct archive_read *a, struct tar *tar, + struct archive_entry *entry, const void *h) +{ + const struct archive_entry_header_ustar *header; + char tartype; + + (void)a; /* UNUSED */ + + header = (const struct archive_entry_header_ustar *)h; + if (header->linkname[0]) + archive_strncpy(&(tar->entry_linkpath), header->linkname, + sizeof(header->linkname)); + else + archive_string_empty(&(tar->entry_linkpath)); + + /* Parse out the numeric fields (all are octal) */ + archive_entry_set_mode(entry, tar_atol(header->mode, sizeof(header->mode))); + archive_entry_set_uid(entry, tar_atol(header->uid, sizeof(header->uid))); + archive_entry_set_gid(entry, tar_atol(header->gid, sizeof(header->gid))); + tar->entry_bytes_remaining = tar_atol(header->size, sizeof(header->size)); + tar->realsize = tar->entry_bytes_remaining; + archive_entry_set_size(entry, tar->entry_bytes_remaining); + archive_entry_set_mtime(entry, tar_atol(header->mtime, sizeof(header->mtime)), 0); + + /* Handle the tar type flag appropriately. */ + tartype = header->typeflag[0]; + + switch (tartype) { + case '1': /* Hard link */ + archive_entry_copy_hardlink(entry, tar->entry_linkpath.s); + /* + * The following may seem odd, but: Technically, tar + * does not store the file type for a "hard link" + * entry, only the fact that it is a hard link. So, I + * leave the type zero normally. But, pax interchange + * format allows hard links to have data, which + * implies that the underlying entry is a regular + * file. 
+ */ + if (archive_entry_size(entry) > 0) + archive_entry_set_filetype(entry, AE_IFREG); + + /* + * A tricky point: Traditionally, tar readers have + * ignored the size field when reading hardlink + * entries, and some writers put non-zero sizes even + * though the body is empty. POSIX blessed this + * convention in the 1988 standard, but broke with + * this tradition in 2001 by permitting hardlink + * entries to store valid bodies in pax interchange + * format, but not in ustar format. Since there is no + * hard and fast way to distinguish pax interchange + * from earlier archives (the 'x' and 'g' entries are + * optional, after all), we need a heuristic. + */ + if (archive_entry_size(entry) == 0) { + /* If the size is already zero, we're done. */ + } else if (a->archive.archive_format + == ARCHIVE_FORMAT_TAR_PAX_INTERCHANGE) { + /* Definitely pax extended; must obey hardlink size. */ + } else if (a->archive.archive_format == ARCHIVE_FORMAT_TAR + || a->archive.archive_format == ARCHIVE_FORMAT_TAR_GNUTAR) + { + /* Old-style or GNU tar: we must ignore the size. */ + archive_entry_set_size(entry, 0); + tar->entry_bytes_remaining = 0; + } else if (archive_read_format_tar_bid(a) > 50) { + /* + * We don't know if it's pax: If the bid + * function sees a valid ustar header + * immediately following, then let's ignore + * the hardlink size. + */ + archive_entry_set_size(entry, 0); + tar->entry_bytes_remaining = 0; + } + /* + * TODO: There are still two cases I'd like to handle: + * = a ustar non-pax archive with a hardlink entry at + * end-of-archive. (Look for block of nulls following?) + * = a pax archive that has not seen any pax headers + * and has an entry which is a hardlink entry storing + * a body containing an uncompressed tar archive. + * The first is worth addressing; I don't see any reliable + * way to deal with the second possibility. 
+ */ + break; + case '2': /* Symlink */ + archive_entry_set_filetype(entry, AE_IFLNK); + archive_entry_set_size(entry, 0); + tar->entry_bytes_remaining = 0; + archive_entry_copy_symlink(entry, tar->entry_linkpath.s); + break; + case '3': /* Character device */ + archive_entry_set_filetype(entry, AE_IFCHR); + archive_entry_set_size(entry, 0); + tar->entry_bytes_remaining = 0; + break; + case '4': /* Block device */ + archive_entry_set_filetype(entry, AE_IFBLK); + archive_entry_set_size(entry, 0); + tar->entry_bytes_remaining = 0; + break; + case '5': /* Dir */ + archive_entry_set_filetype(entry, AE_IFDIR); + archive_entry_set_size(entry, 0); + tar->entry_bytes_remaining = 0; + break; + case '6': /* FIFO device */ + archive_entry_set_filetype(entry, AE_IFIFO); + archive_entry_set_size(entry, 0); + tar->entry_bytes_remaining = 0; + break; + case 'D': /* GNU incremental directory type */ + /* + * No special handling is actually required here. + * It might be nice someday to preprocess the file list and + * provide it to the client, though. + */ + archive_entry_set_filetype(entry, AE_IFDIR); + break; + case 'M': /* GNU "Multi-volume" (remainder of file from last archive)*/ + /* + * As far as I can tell, this is just like a regular file + * entry, except that the contents should be _appended_ to + * the indicated file at the indicated offset. This may + * require some API work to fully support. + */ + break; + case 'N': /* Old GNU "long filename" entry. */ + /* The body of this entry is a script for renaming + * previously-extracted entries. Ugh. It will never + * be supported by libarchive. */ + archive_entry_set_filetype(entry, AE_IFREG); + break; + case 'S': /* GNU sparse files */ + /* + * Sparse files are really just regular files with + * sparse information in the extended area. + */ + /* FALLTHROUGH */ + default: /* Regular file and non-standard types */ + /* + * Per POSIX: non-recognized types should always be + * treated as regular files. 
+ */ + archive_entry_set_filetype(entry, AE_IFREG); + break; + } + return (0); +} + +/* + * Parse out header elements for "old-style" tar archives. + */ +static int +header_old_tar(struct archive_read *a, struct tar *tar, + struct archive_entry *entry, const void *h) +{ + const struct archive_entry_header_ustar *header; + + /* Copy filename over (to ensure null termination). */ + header = (const struct archive_entry_header_ustar *)h; + archive_strncpy(&(tar->entry_pathname), header->name, sizeof(header->name)); + archive_entry_copy_pathname(entry, tar->entry_pathname.s); + + /* Grab rest of common fields */ + header_common(a, tar, entry, h); + + tar->entry_padding = 0x1ff & (-tar->entry_bytes_remaining); + return (0); +} + +/* + * Parse a file header for a pax extended archive entry. + */ +static int +header_pax_global(struct archive_read *a, struct tar *tar, + struct archive_entry *entry, const void *h) +{ + int err; + + err = read_body_to_string(a, tar, &(tar->pax_global), h); + if (err != ARCHIVE_OK) + return (err); + err = tar_read_header(a, tar, entry); + return (err); +} + +static int +header_pax_extensions(struct archive_read *a, struct tar *tar, + struct archive_entry *entry, const void *h) +{ + int err, err2; + + err = read_body_to_string(a, tar, &(tar->pax_header), h); + if (err != ARCHIVE_OK) + return (err); + + /* Parse the next header. */ + err = tar_read_header(a, tar, entry); + if ((err != ARCHIVE_OK) && (err != ARCHIVE_WARN)) + return (err); + + /* + * TODO: Parse global/default options into 'entry' struct here + * before handling file-specific options. + * + * This design (parse standard header, then overwrite with pax + * extended attribute data) usually works well, but isn't ideal; + * it would be better to parse the pax extended attributes first + * and then skip any fields in the standard header that were + * defined in the pax header. 
+ */ + err2 = pax_header(a, tar, entry, tar->pax_header.s); + err = err_combine(err, err2); + tar->entry_padding = 0x1ff & (-tar->entry_bytes_remaining); + return (err); +} + + +/* + * Parse a file header for a Posix "ustar" archive entry. This also + * handles "pax" or "extended ustar" entries. + */ +static int +header_ustar(struct archive_read *a, struct tar *tar, + struct archive_entry *entry, const void *h) +{ + const struct archive_entry_header_ustar *header; + struct archive_string *as; + + header = (const struct archive_entry_header_ustar *)h; + + /* Copy name into an internal buffer to ensure null-termination. */ + as = &(tar->entry_pathname); + if (header->prefix[0]) { + archive_strncpy(as, header->prefix, sizeof(header->prefix)); + if (as->s[archive_strlen(as) - 1] != '/') + archive_strappend_char(as, '/'); + archive_strncat(as, header->name, sizeof(header->name)); + } else + archive_strncpy(as, header->name, sizeof(header->name)); + + archive_entry_copy_pathname(entry, as->s); + + /* Handle rest of common fields. */ + header_common(a, tar, entry, h); + + /* Handle POSIX ustar fields. */ + archive_strncpy(&(tar->entry_uname), header->uname, + sizeof(header->uname)); + archive_entry_copy_uname(entry, tar->entry_uname.s); + + archive_strncpy(&(tar->entry_gname), header->gname, + sizeof(header->gname)); + archive_entry_copy_gname(entry, tar->entry_gname.s); + + /* Parse out device numbers only for char and block specials. */ + if (header->typeflag[0] == '3' || header->typeflag[0] == '4') { + archive_entry_set_rdevmajor(entry, + tar_atol(header->rdevmajor, sizeof(header->rdevmajor))); + archive_entry_set_rdevminor(entry, + tar_atol(header->rdevminor, sizeof(header->rdevminor))); + } + + tar->entry_padding = 0x1ff & (-tar->entry_bytes_remaining); + + return (0); +} + + +/* + * Parse the pax extended attributes record. + * + * Returns non-zero if there's an error in the data. 
+ */ +static int +pax_header(struct archive_read *a, struct tar *tar, + struct archive_entry *entry, char *attr) +{ + size_t attr_length, l, line_length; + char *p; + char *key, *value; + int err, err2; + + attr_length = strlen(attr); + tar->pax_hdrcharset_binary = 0; + archive_string_empty(&(tar->entry_gname)); + archive_string_empty(&(tar->entry_linkpath)); + archive_string_empty(&(tar->entry_pathname)); + archive_string_empty(&(tar->entry_pathname_override)); + archive_string_empty(&(tar->entry_uname)); + err = ARCHIVE_OK; + while (attr_length > 0) { + /* Parse decimal length field at start of line. */ + line_length = 0; + l = attr_length; + p = attr; /* Record start of line. */ + while (l>0) { + if (*p == ' ') { + p++; + l--; + break; + } + if (*p < '0' || *p > '9') { + archive_set_error(&a->archive, ARCHIVE_ERRNO_MISC, + "Ignoring malformed pax extended attributes"); + return (ARCHIVE_WARN); + } + line_length *= 10; + line_length += *p - '0'; + if (line_length > 999999) { + archive_set_error(&a->archive, ARCHIVE_ERRNO_MISC, + "Rejecting pax extended attribute > 1MB"); + return (ARCHIVE_WARN); + } + p++; + l--; + } + + /* + * Parsed length must be no bigger than available data, + * at least 1, and the last character of the line must + * be '\n'. + */ + if (line_length > attr_length + || line_length < 1 + || attr[line_length - 1] != '\n') + { + archive_set_error(&a->archive, ARCHIVE_ERRNO_MISC, + "Ignoring malformed pax extended attribute"); + return (ARCHIVE_WARN); + } + + /* Null-terminate the line. */ + attr[line_length - 1] = '\0'; + + /* Find end of key and null terminate it. */ + key = p; + if (key[0] == '=') + return (-1); + while (*p && *p != '=') + ++p; + if (*p == '\0') { + archive_set_error(&a->archive, ARCHIVE_ERRNO_MISC, + "Invalid pax extended attributes"); + return (ARCHIVE_WARN); + } + *p = '\0'; + + /* Identify null-terminated 'value' portion. */ + value = p + 1; + + /* Identify this attribute and set it in the entry. 
*/ + err2 = pax_attribute(tar, entry, key, value); + err = err_combine(err, err2); + + /* Skip to next line */ + attr += line_length; + attr_length -= line_length; + } + if (archive_strlen(&(tar->entry_gname)) > 0) { + value = tar->entry_gname.s; + if (tar->pax_hdrcharset_binary) + archive_entry_copy_gname(entry, value); + else { + if (!archive_entry_update_gname_utf8(entry, value)) { + err = ARCHIVE_WARN; + archive_set_error(&a->archive, + ARCHIVE_ERRNO_FILE_FORMAT, + "Gname in pax header can't " + "be converted to current locale."); + } + } + } + if (archive_strlen(&(tar->entry_linkpath)) > 0) { + value = tar->entry_linkpath.s; + if (tar->pax_hdrcharset_binary) + archive_entry_copy_link(entry, value); + else { + if (!archive_entry_update_link_utf8(entry, value)) { + err = ARCHIVE_WARN; + archive_set_error(&a->archive, + ARCHIVE_ERRNO_FILE_FORMAT, + "Linkname in pax header can't " + "be converted to current locale."); + } + } + } + /* + * Some extensions (such as the GNU sparse file extensions) + * deliberately store a synthetic name under the regular 'path' + * attribute and the real file name under a different attribute. + * Since we're supposed to not care about the order, we + * have no choice but to store all of the various filenames + * we find and figure it all out afterwards. This is the + * figuring out part. 
+ */ + value = NULL; + if (archive_strlen(&(tar->entry_pathname_override)) > 0) + value = tar->entry_pathname_override.s; + else if (archive_strlen(&(tar->entry_pathname)) > 0) + value = tar->entry_pathname.s; + if (value != NULL) { + if (tar->pax_hdrcharset_binary) + archive_entry_copy_pathname(entry, value); + else { + if (!archive_entry_update_pathname_utf8(entry, value)) { + err = ARCHIVE_WARN; + archive_set_error(&a->archive, + ARCHIVE_ERRNO_FILE_FORMAT, + "Pathname in pax header can't be " + "converted to current locale."); + } + } + } + if (archive_strlen(&(tar->entry_uname)) > 0) { + value = tar->entry_uname.s; + if (tar->pax_hdrcharset_binary) + archive_entry_copy_uname(entry, value); + else { + if (!archive_entry_update_uname_utf8(entry, value)) { + err = ARCHIVE_WARN; + archive_set_error(&a->archive, + ARCHIVE_ERRNO_FILE_FORMAT, + "Uname in pax header can't " + "be converted to current locale."); + } + } + } + return (err); +} + +static int +pax_attribute_xattr(struct archive_entry *entry, + char *name, char *value) +{ + char *name_decoded; + void *value_decoded; + size_t value_len; + + if (strlen(name) < 18 || (strncmp(name, "LIBARCHIVE.xattr.", 17)) != 0) + return 3; + + name += 17; + + /* URL-decode name */ + name_decoded = url_decode(name); + if (name_decoded == NULL) + return 2; + + /* Base-64 decode value */ + value_decoded = base64_decode(value, strlen(value), &value_len); + if (value_decoded == NULL) { + free(name_decoded); + return 1; + } + + archive_entry_xattr_add_entry(entry, name_decoded, + value_decoded, value_len); + + free(name_decoded); + free(value_decoded); + return 0; +} + +/* + * Parse a single key=value attribute. key/value pointers are + * assumed to point into reasonably long-lived storage. + * + * Note that POSIX reserves all-lowercase keywords. Vendor-specific + * extensions should always have keywords of the form "VENDOR.attribute" + * In particular, it's quite feasible to support many different + * vendor extensions here. 
I'm using "LIBARCHIVE" for extensions + * unique to this library. + * + * Investigate other vendor-specific extensions and see if + * any of them look useful. + */ +static int +pax_attribute(struct tar *tar, struct archive_entry *entry, + char *key, char *value) +{ +#ifndef __minix + int64_t s; +#else + time_t s; +#endif + long n; + wchar_t *wp; + + switch (key[0]) { + case 'G': + /* GNU "0.0" sparse pax format. */ + if (strcmp(key, "GNU.sparse.numblocks") == 0) { + tar->sparse_offset = -1; + tar->sparse_numbytes = -1; + tar->sparse_gnu_major = 0; + tar->sparse_gnu_minor = 0; + } + if (strcmp(key, "GNU.sparse.offset") == 0) { + tar->sparse_offset = tar_atol10(value, strlen(value)); + if (tar->sparse_numbytes != -1) { + gnu_add_sparse_entry(tar, + tar->sparse_offset, tar->sparse_numbytes); + tar->sparse_offset = -1; + tar->sparse_numbytes = -1; + } + } + if (strcmp(key, "GNU.sparse.numbytes") == 0) { + tar->sparse_numbytes = tar_atol10(value, strlen(value)); + if (tar->sparse_numbytes != -1) { + gnu_add_sparse_entry(tar, + tar->sparse_offset, tar->sparse_numbytes); + tar->sparse_offset = -1; + tar->sparse_numbytes = -1; + } + } + if (strcmp(key, "GNU.sparse.size") == 0) { + tar->realsize = tar_atol10(value, strlen(value)); + archive_entry_set_size(entry, tar->realsize); + } + + /* GNU "0.1" sparse pax format. 
*/ + if (strcmp(key, "GNU.sparse.map") == 0) { + tar->sparse_gnu_major = 0; + tar->sparse_gnu_minor = 1; + if (gnu_sparse_01_parse(tar, value) != ARCHIVE_OK) + return (ARCHIVE_WARN); + } + + /* GNU "1.0" sparse pax format */ + if (strcmp(key, "GNU.sparse.major") == 0) { + tar->sparse_gnu_major = tar_atol10(value, strlen(value)); + tar->sparse_gnu_pending = 1; + } + if (strcmp(key, "GNU.sparse.minor") == 0) { + tar->sparse_gnu_minor = tar_atol10(value, strlen(value)); + tar->sparse_gnu_pending = 1; + } + if (strcmp(key, "GNU.sparse.name") == 0) { + /* + * The real filename; when storing sparse + * files, GNU tar puts a synthesized name into + * the regular 'path' attribute in an attempt + * to limit confusion. ;-) + */ + archive_strcpy(&(tar->entry_pathname_override), value); + } + if (strcmp(key, "GNU.sparse.realsize") == 0) { + tar->realsize = tar_atol10(value, strlen(value)); + archive_entry_set_size(entry, tar->realsize); + } + break; + case 'L': + /* Our extensions */ +/* TODO: Handle arbitrary extended attributes... 
*/ +/* + if (strcmp(key, "LIBARCHIVE.xxxxxxx")==0) + archive_entry_set_xxxxxx(entry, value); +*/ + if (strcmp(key, "LIBARCHIVE.creationtime")==0) { + pax_time(value, &s, &n); + archive_entry_set_birthtime(entry, s, n); + } + if (strncmp(key, "LIBARCHIVE.xattr.", 17)==0) + pax_attribute_xattr(entry, key, value); + break; + case 'S': + /* We support some keys used by the "star" archiver */ + if (strcmp(key, "SCHILY.acl.access")==0) { + wp = utf8_decode(tar, value, strlen(value)); + /* TODO: if (wp == NULL) */ + __archive_entry_acl_parse_w(entry, wp, + ARCHIVE_ENTRY_ACL_TYPE_ACCESS); + } else if (strcmp(key, "SCHILY.acl.default")==0) { + wp = utf8_decode(tar, value, strlen(value)); + /* TODO: if (wp == NULL) */ + __archive_entry_acl_parse_w(entry, wp, + ARCHIVE_ENTRY_ACL_TYPE_DEFAULT); + } else if (strcmp(key, "SCHILY.devmajor")==0) { + archive_entry_set_rdevmajor(entry, + tar_atol10(value, strlen(value))); + } else if (strcmp(key, "SCHILY.devminor")==0) { + archive_entry_set_rdevminor(entry, + tar_atol10(value, strlen(value))); + } else if (strcmp(key, "SCHILY.fflags")==0) { + archive_entry_copy_fflags_text(entry, value); + } else if (strcmp(key, "SCHILY.dev")==0) { + archive_entry_set_dev(entry, + tar_atol10(value, strlen(value))); + } else if (strcmp(key, "SCHILY.ino")==0) { + archive_entry_set_ino(entry, + tar_atol10(value, strlen(value))); + } else if (strcmp(key, "SCHILY.nlink")==0) { + archive_entry_set_nlink(entry, + tar_atol10(value, strlen(value))); + } else if (strcmp(key, "SCHILY.realsize")==0) { + tar->realsize = tar_atol10(value, strlen(value)); + archive_entry_set_size(entry, tar->realsize); + } + break; + case 'a': + if (strcmp(key, "atime")==0) { + pax_time(value, &s, &n); + archive_entry_set_atime(entry, s, n); + } + break; + case 'c': + if (strcmp(key, "ctime")==0) { + pax_time(value, &s, &n); + archive_entry_set_ctime(entry, s, n); + } else if (strcmp(key, "charset")==0) { + /* TODO: Publish charset information in entry. 
*/ + } else if (strcmp(key, "comment")==0) { + /* TODO: Publish comment in entry. */ + } + break; + case 'g': + if (strcmp(key, "gid")==0) { + archive_entry_set_gid(entry, + tar_atol10(value, strlen(value))); + } else if (strcmp(key, "gname")==0) { + archive_strcpy(&(tar->entry_gname), value); + } + break; + case 'h': + if (strcmp(key, "hdrcharset") == 0) { + if (strcmp(value, "BINARY") == 0) + tar->pax_hdrcharset_binary = 1; + else if (strcmp(value, "ISO-IR 10646 2000 UTF-8") == 0) + tar->pax_hdrcharset_binary = 0; + else { + /* TODO: Warn about unsupported hdrcharset */ + } + } + break; + case 'l': + /* pax interchange doesn't distinguish hardlink vs. symlink. */ + if (strcmp(key, "linkpath")==0) { + archive_strcpy(&(tar->entry_linkpath), value); + } + break; + case 'm': + if (strcmp(key, "mtime")==0) { + pax_time(value, &s, &n); + archive_entry_set_mtime(entry, s, n); + } + break; + case 'p': + if (strcmp(key, "path")==0) { + archive_strcpy(&(tar->entry_pathname), value); + } + break; + case 'r': + /* POSIX has reserved 'realtime.*' */ + break; + case 's': + /* POSIX has reserved 'security.*' */ + /* Someday: if (strcmp(key, "security.acl")==0) { ... } */ + if (strcmp(key, "size")==0) { + /* "size" is the size of the data in the entry. */ + tar->entry_bytes_remaining + = tar_atol10(value, strlen(value)); + /* + * But, "size" is not necessarily the size of + * the file on disk; if this is a sparse file, + * the disk size may have already been set from + * GNU.sparse.realsize or GNU.sparse.size or + * an old GNU header field or SCHILY.realsize + * or .... 
+ */ + if (tar->realsize < 0) { + archive_entry_set_size(entry, + tar->entry_bytes_remaining); + tar->realsize + = tar->entry_bytes_remaining; + } + } + break; + case 'u': + if (strcmp(key, "uid")==0) { + archive_entry_set_uid(entry, + tar_atol10(value, strlen(value))); + } else if (strcmp(key, "uname")==0) { + archive_strcpy(&(tar->entry_uname), value); + } + break; + } + return (0); +} + + + +/* + * parse a decimal time value, which may include a fractional portion + */ +#ifndef __minix +static void +pax_time(const char *p, int64_t *ps, long *pn) +{ + char digit; + int64_t s; + unsigned long l; + int sign; + int64_t limit, last_digit_limit; + + limit = INT64_MAX / 10; + last_digit_limit = INT64_MAX % 10; + + s = 0; + sign = 1; + if (*p == '-') { + sign = -1; + p++; + } + while (*p >= '0' && *p <= '9') { + digit = *p - '0'; + if (s > limit || + (s == limit && digit > last_digit_limit)) { + s = INT64_MAX; + break; + } + s = (s * 10) + digit; + ++p; + } + + *ps = s * sign; + + /* Calculate nanoseconds. */ + *pn = 0; + + if (*p != '.') + return; + + l = 100000000UL; + do { + ++p; + if (*p >= '0' && *p <= '9') + *pn += (*p - '0') * l; + else + break; + } while (l /= 10); +} +#else +static void +pax_time(const char *p, time_t *ps, long *pn) +{ + char digit; + time_t s; + unsigned long l; + int sign; + int32_t limit, last_digit_limit; + + limit = INT32_MAX / 10; + last_digit_limit = INT32_MAX % 10; + + s = 0; + sign = 1; + if (*p == '-') { + sign = -1; + p++; + } + while (*p >= '0' && *p <= '9') { + digit = *p - '0'; + if (s > limit || + (s == limit && digit > last_digit_limit)) { + s = INT32_MAX; + break; + } + s = (s * 10) + digit; + ++p; + } + + *ps = s * sign; + + /* Calculate nanoseconds. 
*/ + *pn = 0; + + if (*p != '.') + return; + + l = 100000000UL; + do { + ++p; + if (*p >= '0' && *p <= '9') + *pn += (*p - '0') * l; + else + break; + } while (l /= 10); +} +#endif +/* + * Parse GNU tar header + */ +static int +header_gnutar(struct archive_read *a, struct tar *tar, + struct archive_entry *entry, const void *h) +{ + const struct archive_entry_header_gnutar *header; + + (void)a; + + /* + * GNU header is like POSIX ustar, except 'prefix' is + * replaced with some other fields. This also means the + * filename is stored as in old-style archives. + */ + + /* Grab fields common to all tar variants. */ + header_common(a, tar, entry, h); + + /* Copy filename over (to ensure null termination). */ + header = (const struct archive_entry_header_gnutar *)h; + archive_strncpy(&(tar->entry_pathname), header->name, + sizeof(header->name)); + archive_entry_copy_pathname(entry, tar->entry_pathname.s); + + /* Fields common to ustar and GNU */ + /* XXX Can the following be factored out since it's common + * to ustar and gnu tar? Is it okay to move it down into + * header_common, perhaps? */ + archive_strncpy(&(tar->entry_uname), + header->uname, sizeof(header->uname)); + archive_entry_copy_uname(entry, tar->entry_uname.s); + + archive_strncpy(&(tar->entry_gname), + header->gname, sizeof(header->gname)); + archive_entry_copy_gname(entry, tar->entry_gname.s); + + /* Parse out device numbers only for char and block specials */ + if (header->typeflag[0] == '3' || header->typeflag[0] == '4') { + archive_entry_set_rdevmajor(entry, + tar_atol(header->rdevmajor, sizeof(header->rdevmajor))); + archive_entry_set_rdevminor(entry, + tar_atol(header->rdevminor, sizeof(header->rdevminor))); + } else + archive_entry_set_rdev(entry, 0); + + tar->entry_padding = 0x1ff & (-tar->entry_bytes_remaining); + + /* Grab GNU-specific fields. 
+ */ + archive_entry_set_atime(entry, + tar_atol(header->atime, sizeof(header->atime)), 0); + archive_entry_set_ctime(entry, + tar_atol(header->ctime, sizeof(header->ctime)), 0); + if (header->realsize[0] != 0) { + tar->realsize + = tar_atol(header->realsize, sizeof(header->realsize)); + archive_entry_set_size(entry, tar->realsize); + } + + if (header->sparse[0].offset[0] != 0) { + gnu_sparse_old_read(a, tar, header); + } else { + if (header->isextended[0] != 0) { + /* XXX WTF? XXX */ + } + } + + return (0); +} + +static void +gnu_add_sparse_entry(struct tar *tar, off_t offset, off_t remaining) +{ + struct sparse_block *p; + + p = (struct sparse_block *)malloc(sizeof(*p)); + if (p == NULL) + __archive_errx(1, "Out of memory"); + memset(p, 0, sizeof(*p)); + if (tar->sparse_last != NULL) + tar->sparse_last->next = p; + else + tar->sparse_list = p; + tar->sparse_last = p; + p->offset = offset; + p->remaining = remaining; +} + +static void +gnu_clear_sparse_list(struct tar *tar) +{ + struct sparse_block *p; + + while (tar->sparse_list != NULL) { + p = tar->sparse_list; + tar->sparse_list = p->next; + free(p); + } + tar->sparse_last = NULL; +} + +/* + * GNU tar old-format sparse data. + * + * GNU old-format sparse data is stored in a fixed-field + * format. Offset/size values are 11-byte octal fields (same + * format as 'size' field in ustar header). These are + * stored in the header, allocating subsequent header blocks + * as needed. Extending the header in this way is a pretty + * severe POSIX violation; this design has earned GNU tar a + * lot of criticism. 
+ */ + +static int +gnu_sparse_old_read(struct archive_read *a, struct tar *tar, + const struct archive_entry_header_gnutar *header) +{ + ssize_t bytes_read; + const void *data; + struct extended { + struct gnu_sparse sparse[21]; + char isextended[1]; + char padding[7]; + }; + const struct extended *ext; + + gnu_sparse_old_parse(tar, header->sparse, 4); + if (header->isextended[0] == 0) + return (ARCHIVE_OK); + + do { + data = __archive_read_ahead(a, 512, &bytes_read); + if (bytes_read < 0) + return (ARCHIVE_FATAL); + if (bytes_read < 512) { + archive_set_error(&a->archive, ARCHIVE_ERRNO_FILE_FORMAT, + "Truncated tar archive " + "detected while reading sparse file data"); + return (ARCHIVE_FATAL); + } + __archive_read_consume(a, 512); + ext = (const struct extended *)data; + gnu_sparse_old_parse(tar, ext->sparse, 21); + } while (ext->isextended[0] != 0); + if (tar->sparse_list != NULL) + tar->entry_offset = tar->sparse_list->offset; + return (ARCHIVE_OK); +} + +static void +gnu_sparse_old_parse(struct tar *tar, + const struct gnu_sparse *sparse, int length) +{ + while (length > 0 && sparse->offset[0] != 0) { + gnu_add_sparse_entry(tar, + tar_atol(sparse->offset, sizeof(sparse->offset)), + tar_atol(sparse->numbytes, sizeof(sparse->numbytes))); + sparse++; + length--; + } +} + +/* + * GNU tar sparse format 0.0 + * + * Beginning with GNU tar 1.15, sparse files are stored using + * information in the pax extended header. The GNU tar maintainers + * have gone through a number of variations in the process of working + * out this scheme; fortunately, they're all numbered. + * + * Sparse format 0.0 uses attribute GNU.sparse.numblocks to store the + * number of blocks, and GNU.sparse.offset/GNU.sparse.numbytes to + * store offset/size for each block. The repeated instances of these + * latter fields violate the pax specification (which frowns on + * duplicate keys), so this format was quickly replaced. 
+ */ + +/* + * GNU tar sparse format 0.1 + * + * This version replaced the offset/numbytes attributes with + * a single "map" attribute that stored a list of integers. This + * format had two problems: First, the "map" attribute could be very + * long, which caused problems for some implementations. More + * importantly, the sparse data was lost when extracted by archivers + * that didn't recognize this extension. + */ + +static int +gnu_sparse_01_parse(struct tar *tar, const char *p) +{ + const char *e; + off_t offset = -1, size = -1; + + for (;;) { + e = p; + while (*e != '\0' && *e != ',') { + if (*e < '0' || *e > '9') + return (ARCHIVE_WARN); + e++; + } + if (offset < 0) { + offset = tar_atol10(p, e - p); + if (offset < 0) + return (ARCHIVE_WARN); + } else { + size = tar_atol10(p, e - p); + if (size < 0) + return (ARCHIVE_WARN); + gnu_add_sparse_entry(tar, offset, size); + offset = -1; + } + if (*e == '\0') + return (ARCHIVE_OK); + p = e + 1; + } +} + +/* + * GNU tar sparse format 1.0 + * + * The idea: The offset/size data is stored as a series of base-10 + * ASCII numbers prepended to the file data, so that dearchivers that + * don't support this format will extract the block map along with the + * data and a separate post-process can restore the sparseness. + * + * Unfortunately, GNU tar 1.16 had a bug that added unnecessary + * padding to the body of the file when using this format. GNU tar + * 1.17 corrected this bug without bumping the version number, so + * it's not possible to support both variants. This code supports + * the later variant at the expense of not supporting the former. + * + * This variant also replaced GNU.sparse.size with GNU.sparse.realsize + * and introduced the GNU.sparse.major/GNU.sparse.minor attributes. + */ + +/* + * Read the next line from the input, and parse it as a decimal + * integer followed by '\n'. Returns positive integer value or + * negative on error. 
+ */ +#ifndef __minix +static int64_t +gnu_sparse_10_atol(struct archive_read *a, struct tar *tar, + ssize_t *remaining) +{ + int64_t l, limit, last_digit_limit; + const char *p; + ssize_t bytes_read; + int base, digit; + + base = 10; + limit = INT64_MAX / base; + last_digit_limit = INT64_MAX % base; + + /* + * Skip any lines starting with '#'; GNU tar specs + * don't require this, but they should. + */ + do { + bytes_read = readline(a, tar, &p, tar_min(*remaining, 100)); + if (bytes_read <= 0) + return (ARCHIVE_FATAL); + *remaining -= bytes_read; + } while (p[0] == '#'); + + l = 0; + while (bytes_read > 0) { + if (*p == '\n') + return (l); + if (*p < '0' || *p >= '0' + base) + return (ARCHIVE_WARN); + digit = *p - '0'; + if (l > limit || (l == limit && digit > last_digit_limit)) + l = INT64_MAX; /* Truncate on overflow. */ + else + l = (l * base) + digit; + p++; + bytes_read--; + } + /* TODO: Error message. */ + return (ARCHIVE_WARN); +} +#else +static int32_t +gnu_sparse_10_atol(struct archive_read *a, struct tar *tar, + ssize_t *remaining) +{ + int32_t l, limit, last_digit_limit; + const char *p; + ssize_t bytes_read; + int base, digit; + + base = 10; + limit = INT32_MAX / base; + last_digit_limit = INT32_MAX % base; + + /* + * Skip any lines starting with '#'; GNU tar specs + * don't require this, but they should. + */ + do { + bytes_read = readline(a, tar, &p, tar_min(*remaining, 100)); + if (bytes_read <= 0) + return (ARCHIVE_FATAL); + *remaining -= bytes_read; + } while (p[0] == '#'); + + l = 0; + while (bytes_read > 0) { + if (*p == '\n') + return (l); + if (*p < '0' || *p >= '0' + base) + return (ARCHIVE_WARN); + digit = *p - '0'; + if (l > limit || (l == limit && digit > last_digit_limit)) + l = INT32_MAX; /* Truncate on overflow. */ + else + l = (l * base) + digit; + p++; + bytes_read--; + } + /* TODO: Error message. */ + return (ARCHIVE_WARN); +} +#endif +/* + * Returns length (in bytes) of the sparse data description + * that was read. 
+ */ +static ssize_t +gnu_sparse_10_read(struct archive_read *a, struct tar *tar) +{ + ssize_t remaining, bytes_read; + int entries; + off_t offset, size, to_skip; + + /* Clear out the existing sparse list. */ + gnu_clear_sparse_list(tar); + + remaining = tar->entry_bytes_remaining; + + /* Parse entries. */ + entries = gnu_sparse_10_atol(a, tar, &remaining); + if (entries < 0) + return (ARCHIVE_FATAL); + /* Parse the individual entries. */ + while (entries-- > 0) { + /* Parse offset/size */ + offset = gnu_sparse_10_atol(a, tar, &remaining); + if (offset < 0) + return (ARCHIVE_FATAL); + size = gnu_sparse_10_atol(a, tar, &remaining); + if (size < 0) + return (ARCHIVE_FATAL); + /* Add a new sparse entry. */ + gnu_add_sparse_entry(tar, offset, size); + } + /* Skip rest of block... */ + bytes_read = tar->entry_bytes_remaining - remaining; + to_skip = 0x1ff & -bytes_read; + if (to_skip != __archive_read_skip(a, to_skip)) + return (ARCHIVE_FATAL); + return (bytes_read + to_skip); +} + +/*- + * Convert text->integer. + * + * Traditional tar formats (including POSIX) specify base-8 for + * all of the standard numeric fields. This is a significant limitation + * in practice: + * = file size is limited to 8GB + * = rdevmajor and rdevminor are limited to 21 bits + * = uid/gid are limited to 21 bits + * + * There are two workarounds for this: + * = pax extended headers, which use variable-length string fields + * = GNU tar and STAR both allow either base-8 or base-256 in + * most fields. The high bit is set to indicate base-256. + * + * On read, this implementation supports both extensions. + */ +#ifndef __minix +static int64_t +tar_atol(const char *p, unsigned char_cnt) +{ + /* + * Technically, GNU tar considers a field to be in base-256 + * only if the first byte is 0xff or 0x80. 
+ */ + if (*p & 0x80) + return (tar_atol256(p, char_cnt)); + return (tar_atol8(p, char_cnt)); +} +#else +static int32_t +tar_atol(const char *p, unsigned char_cnt) +{ + /* + * Technically, GNU tar considers a field to be in base-256 + * only if the first byte is 0xff or 0x80. + */ + if (*p & 0x80) + return (tar_atol256(p, char_cnt)); + return (tar_atol8(p, char_cnt)); +} +#endif + +/* + * Note that this implementation does not (and should not!) obey + * locale settings; you cannot simply substitute strtol here, since + * it does obey locale. + */ +#ifndef __minix +static int64_t +tar_atol8(const char *p, unsigned char_cnt) +{ + int64_t l, limit, last_digit_limit; + int digit, sign, base; + + base = 8; + limit = INT64_MAX / base; + last_digit_limit = INT64_MAX % base; + + while (*p == ' ' || *p == '\t') + p++; + if (*p == '-') { + sign = -1; + p++; + } else + sign = 1; + + l = 0; + digit = *p - '0'; + while (digit >= 0 && digit < base && char_cnt-- > 0) { + if (l>limit || (l == limit && digit > last_digit_limit)) { + l = INT64_MAX; /* Truncate on overflow. */ + break; + } + l = (l * base) + digit; + digit = *++p - '0'; + } + return (sign < 0) ? -l : l; +} +#else +static int32_t +tar_atol8(const char *p, unsigned char_cnt) +{ + int32_t l, limit, last_digit_limit; + int digit, sign, base; + + base = 8; + limit = INT32_MAX / base; + last_digit_limit = INT32_MAX % base; + + while (*p == ' ' || *p == '\t') + p++; + if (*p == '-') { + sign = -1; + p++; + } else + sign = 1; + + l = 0; + digit = *p - '0'; + while (digit >= 0 && digit < base && char_cnt-- > 0) { + if (l>limit || (l == limit && digit > last_digit_limit)) { + l = INT32_MAX; /* Truncate on overflow. */ + break; + } + l = (l * base) + digit; + digit = *++p - '0'; + } + return (sign < 0) ? -l : l; +} +#endif +/* + * Note that this implementation does not (and should not!) obey + * locale settings; you cannot simply substitute strtol here, since + * it does obey locale. 
+ */ +#ifndef __minix +static int64_t +tar_atol10(const char *p, unsigned char_cnt) +{ + int64_t l, limit, last_digit_limit; + int base, digit, sign; + + base = 10; + limit = INT64_MAX / base; + last_digit_limit = INT64_MAX % base; + + while (*p == ' ' || *p == '\t') + p++; + if (*p == '-') { + sign = -1; + p++; + } else + sign = 1; + + l = 0; + digit = *p - '0'; + while (digit >= 0 && digit < base && char_cnt-- > 0) { + if (l > limit || (l == limit && digit > last_digit_limit)) { + l = INT64_MAX; /* Truncate on overflow. */ + break; + } + l = (l * base) + digit; + digit = *++p - '0'; + } + return (sign < 0) ? -l : l; +} +#else +static int32_t +tar_atol10(const char *p, unsigned char_cnt) +{ + int32_t l, limit, last_digit_limit; + int base, digit, sign; + + base = 10; + limit = INT32_MAX / base; + last_digit_limit = INT32_MAX % base; + + while (*p == ' ' || *p == '\t') + p++; + if (*p == '-') { + sign = -1; + p++; + } else + sign = 1; + + l = 0; + digit = *p - '0'; + while (digit >= 0 && digit < base && char_cnt-- > 0) { + if (l > limit || (l == limit && digit > last_digit_limit)) { + l = INT32_MAX; /* Truncate on overflow. */ + break; + } + l = (l * base) + digit; + digit = *++p - '0'; + } + return (sign < 0) ? -l : l; +} +#endif +/* + * Parse a base-256 integer. This is just a straight signed binary + * value in big-endian order, except that the high-order bit is + * ignored. + */ +#ifndef __minix +static int64_t +tar_atol256(const char *_p, unsigned char_cnt) +{ + int64_t l, upper_limit, lower_limit; + const unsigned char *p = (const unsigned char *)_p; + + upper_limit = INT64_MAX / 256; + lower_limit = INT64_MIN / 256; + + /* Pad with 1 or 0 bits, depending on sign. 
*/ + if ((0x40 & *p) == 0x40) + l = (int64_t)-1; + else + l = 0; + l = (l << 6) | (0x3f & *p++); + while (--char_cnt > 0) { + if (l > upper_limit) { + l = INT64_MAX; /* Truncate on overflow */ + break; + } else if (l < lower_limit) { + l = INT64_MIN; + break; + } + l = (l << 8) | (0xff & (int64_t)*p++); + } + return (l); +} +#else +static int32_t +tar_atol256(const char *_p, unsigned char_cnt) +{ + int32_t l, upper_limit, lower_limit; + const unsigned char *p = (const unsigned char *)_p; + + upper_limit = INT32_MAX / 256; + lower_limit = INT32_MIN / 256; + + /* Pad with 1 or 0 bits, depending on sign. */ + if ((0x40 & *p) == 0x40) + l = (int32_t)-1; + else + l = 0; + l = (l << 6) | (0x3f & *p++); + while (--char_cnt > 0) { + if (l > upper_limit) { + l = INT32_MAX; /* Truncate on overflow */ + break; + } else if (l < lower_limit) { + l = INT32_MIN; + break; + } + l = (l << 8) | (0xff & (int32_t)*p++); + } + return (l); +} +#endif +/* + * Returns length of line (including trailing newline) + * or negative on error. 'start' argument is updated to + * point to first character of line. This avoids copying + * when possible. + */ +static ssize_t +readline(struct archive_read *a, struct tar *tar, const char **start, + ssize_t limit) +{ + ssize_t bytes_read; + ssize_t total_size = 0; + const void *t; + const char *s; + void *p; + + t = __archive_read_ahead(a, 1, &bytes_read); + if (bytes_read <= 0) + return (ARCHIVE_FATAL); + s = t; /* Start of line? */ + p = memchr(t, '\n', bytes_read); + /* If we found '\n' in the read buffer, return pointer to that. */ + if (p != NULL) { + bytes_read = 1 + ((const char *)p) - s; + if (bytes_read > limit) { + archive_set_error(&a->archive, + ARCHIVE_ERRNO_FILE_FORMAT, + "Line too long"); + return (ARCHIVE_FATAL); + } + __archive_read_consume(a, bytes_read); + *start = s; + return (bytes_read); + } + /* Otherwise, we need to accumulate in a line buffer. 
*/ + for (;;) { + if (total_size + bytes_read > limit) { + archive_set_error(&a->archive, + ARCHIVE_ERRNO_FILE_FORMAT, + "Line too long"); + return (ARCHIVE_FATAL); + } + if (archive_string_ensure(&tar->line, total_size + bytes_read) == NULL) { + archive_set_error(&a->archive, ENOMEM, + "Can't allocate working buffer"); + return (ARCHIVE_FATAL); + } + memcpy(tar->line.s + total_size, t, bytes_read); + __archive_read_consume(a, bytes_read); + total_size += bytes_read; + /* If we found '\n', clean up and return. */ + if (p != NULL) { + *start = tar->line.s; + return (total_size); + } + /* Read some more. */ + t = __archive_read_ahead(a, 1, &bytes_read); + if (bytes_read <= 0) + return (ARCHIVE_FATAL); + s = t; /* Start of line? */ + p = memchr(t, '\n', bytes_read); + /* If we found '\n', trim the read. */ + if (p != NULL) { + bytes_read = 1 + ((const char *)p) - s; + } + } +} + +static wchar_t * +utf8_decode(struct tar *tar, const char *src, size_t length) +{ + wchar_t *dest; + ssize_t n; + + /* Ensure pax_entry buffer is big enough. */ + if (tar->pax_entry_length <= length) { + wchar_t *old_entry; + + if (tar->pax_entry_length <= 0) + tar->pax_entry_length = 1024; + while (tar->pax_entry_length <= length + 1) + tar->pax_entry_length *= 2; + + old_entry = tar->pax_entry; + tar->pax_entry = (wchar_t *)realloc(tar->pax_entry, + tar->pax_entry_length * sizeof(wchar_t)); + if (tar->pax_entry == NULL) { + free(old_entry); + /* TODO: Handle this error. */ + return (NULL); + } + } + + dest = tar->pax_entry; + while (length > 0) { + n = UTF8_mbrtowc(dest, src, length); + if (n < 0) + return (NULL); + if (n == 0) + break; + dest++; + src += n; + length -= n; + } + *dest = L'\0'; + return (tar->pax_entry); +} + +/* + * Copied and simplified from FreeBSD libc/locale. 
+ */ +static ssize_t +UTF8_mbrtowc(wchar_t *pwc, const char *s, size_t n) +{ + int ch, i, len, mask; + unsigned long wch; + + if (s == NULL || n == 0 || pwc == NULL) + return (0); + + /* + * Determine the number of octets that make up this character from + * the first octet, and a mask that extracts the interesting bits of + * the first octet. + */ + ch = (unsigned char)*s; + if ((ch & 0x80) == 0) { + mask = 0x7f; + len = 1; + } else if ((ch & 0xe0) == 0xc0) { + mask = 0x1f; + len = 2; + } else if ((ch & 0xf0) == 0xe0) { + mask = 0x0f; + len = 3; + } else if ((ch & 0xf8) == 0xf0) { + mask = 0x07; + len = 4; + } else { + /* Invalid first byte. */ + return (-1); + } + + if (n < (size_t)len) { + /* Valid first byte but truncated. */ + return (-2); + } + + /* + * Decode the octet sequence representing the character in chunks + * of 6 bits, most significant first. + */ + wch = (unsigned char)*s++ & mask; + i = len; + while (--i != 0) { + if ((*s & 0xc0) != 0x80) { + /* Invalid intermediate byte; consume one byte and + * emit '?' */ + *pwc = '?'; + return (1); + } + wch <<= 6; + wch |= *s++ & 0x3f; + } + + /* Assign the value to the output; out-of-range values + * just get truncated. */ + *pwc = (wchar_t)wch; +#ifdef WCHAR_MAX + /* + * If platform has WCHAR_MAX, we can do something + * more sensible with out-of-range values. + */ + if (wch >= WCHAR_MAX) + *pwc = '?'; +#endif + /* Return number of bytes input consumed: 0 for end-of-string. */ + return (wch == L'\0' ? 0 : len); +} + + +/* + * base64_decode - Base64 decode + * + * This accepts most variations of base-64 encoding, including: + * * with or without line breaks + * * with or without the final group padded with '=' or '_' characters + * (The most economical Base-64 variant does not pad the last group and + * omits line breaks; RFC1341 used for MIME requires both.) 
+ */ +static char * +base64_decode(const char *s, size_t len, size_t *out_len) +{ + static const unsigned char digits[64] = { + 'A','B','C','D','E','F','G','H','I','J','K','L','M','N', + 'O','P','Q','R','S','T','U','V','W','X','Y','Z','a','b', + 'c','d','e','f','g','h','i','j','k','l','m','n','o','p', + 'q','r','s','t','u','v','w','x','y','z','0','1','2','3', + '4','5','6','7','8','9','+','/' }; + static unsigned char decode_table[128]; + char *out, *d; + const unsigned char *src = (const unsigned char *)s; + + /* If the decode table is not yet initialized, prepare it. */ + if (decode_table[digits[1]] != 1) { + unsigned i; + memset(decode_table, 0xff, sizeof(decode_table)); + for (i = 0; i < sizeof(digits); i++) + decode_table[digits[i]] = i; + } + + /* Allocate enough space to hold the entire output. */ + /* Note that we may not use all of this... */ + out = (char *)malloc(len - len / 4 + 1); + if (out == NULL) { + *out_len = 0; + return (NULL); + } + d = out; + + while (len > 0) { + /* Collect the next group of (up to) four characters. */ + int v = 0; + int group_size = 0; + while (group_size < 4 && len > 0) { + /* '=' or '_' padding indicates final group. */ + if (*src == '=' || *src == '_') { + len = 0; + break; + } + /* Skip illegal characters (including line breaks) */ + if (*src > 127 || *src < 32 + || decode_table[*src] == 0xff) { + len--; + src++; + continue; + } + v <<= 6; + v |= decode_table[*src++]; + len --; + group_size++; + } + /* Align a short group properly. */ + v <<= 6 * (4 - group_size); + /* Unpack the group we just collected. */ + switch (group_size) { + case 4: d[2] = v & 0xff; + /* FALLTHROUGH */ + case 3: d[1] = (v >> 8) & 0xff; + /* FALLTHROUGH */ + case 2: d[0] = (v >> 16) & 0xff; + break; + case 1: /* this is invalid! 
*/ + break; + } + d += group_size * 3 / 4; + } + + *out_len = d - out; + return (out); +} + +static char * +url_decode(const char *in) +{ + char *out, *d; + const char *s; + + out = (char *)malloc(strlen(in) + 1); + if (out == NULL) + return (NULL); + for (s = in, d = out; *s != '\0'; ) { + if (s[0] == '%' && s[1] != '\0' && s[2] != '\0') { + /* Try to convert % escape */ + int digit1 = tohex(s[1]); + int digit2 = tohex(s[2]); + if (digit1 >= 0 && digit2 >= 0) { + /* Looks good, consume three chars */ + s += 3; + /* Convert output */ + *d++ = ((digit1 << 4) | digit2); + continue; + } + /* Else fall through and treat '%' as normal char */ + } + *d++ = *s++; + } + *d = '\0'; + return (out); +} + +static int +tohex(int c) +{ + if (c >= '0' && c <= '9') + return (c - '0'); + else if (c >= 'A' && c <= 'F') + return (c - 'A' + 10); + else if (c >= 'a' && c <= 'f') + return (c - 'a' + 10); + else + return (-1); +} diff --git a/lib/libarchive/archive_read_support_format_xar.c b/lib/libarchive/archive_read_support_format_xar.c new file mode 100644 index 000000000..b91497542 --- /dev/null +++ b/lib/libarchive/archive_read_support_format_xar.c @@ -0,0 +1,3150 @@ +/*- + * Copyright (c) 2009 Michihiro NAKAJIMA + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR(S) ``AS IS'' AND ANY EXPRESS OR + * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES + * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. 
+ * IN NO EVENT SHALL THE AUTHOR(S) BE LIABLE FOR ANY DIRECT, INDIRECT, + * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT + * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF + * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + */ +#include "archive_platform.h" +__FBSDID("$FreeBSD$"); + +#ifdef HAVE_ERRNO_H +#include <errno.h> +#endif +#ifdef HAVE_STDLIB_H +#include <stdlib.h> +#endif +#if HAVE_LIBXML_XMLREADER_H +#include <libxml/xmlreader.h> +#elif HAVE_BSDXML_H +#include <bsdxml.h> +#elif HAVE_EXPAT_H +#include <expat.h> +#endif +#ifdef HAVE_BZLIB_H +#include <bzlib.h> +#endif +#if HAVE_LZMA_H +#include <lzma.h> +#elif HAVE_LZMADEC_H +#include <lzmadec.h> +#endif +#ifdef HAVE_ZLIB_H +#include <zlib.h> +#endif + +#include "archive.h" +#include "archive_endian.h" +#include "archive_entry.h" +#include "archive_hash.h" +#include "archive_private.h" +#include "archive_read_private.h" + +#if (!defined(HAVE_LIBXML_XMLREADER_H) && \ + !defined(HAVE_BSDXML_H) && !defined(HAVE_EXPAT_H)) ||\ + !defined(HAVE_ZLIB_H) || \ + !defined(ARCHIVE_HAS_MD5) || !defined(ARCHIVE_HAS_SHA1) +/* + * xar needs several external libraries. 
+ * o libxml2 or expat --- XML parser + * o openssl or MD5/SHA1 hash function + * o zlib + * o bzlib2 (option) + * o liblzma (option) + */ +int +archive_read_support_format_xar(struct archive *_a) +{ + struct archive_read *a = (struct archive_read *)_a; + + archive_set_error(&a->archive, ARCHIVE_ERRNO_MISC, + "Xar not supported on this platform"); + return (ARCHIVE_WARN); +} + +#else /* Support xar format */ + +//#define DEBUG 1 +//#define DEBUG_PRINT_TOC 1 +#if DEBUG_PRINT_TOC +#define PRINT_TOC(d, outbytes) do { \ + unsigned char *x = (unsigned char *)(uintptr_t)d; \ + unsigned char c = x[outbytes-1]; \ + x[outbytes - 1] = 0; \ + fprintf(stderr, "%s", x); \ + fprintf(stderr, "%c", c); \ + x[outbytes - 1] = c; \ +} while (0) +#else +#define PRINT_TOC(d, outbytes) +#endif + +#define HEADER_MAGIC 0x78617221 +#define HEADER_SIZE 28 +#define HEADER_VERSION 1 +#define CKSUM_NONE 0 +#define CKSUM_SHA1 1 +#define CKSUM_MD5 2 + +#define MD5_SIZE 16 +#define SHA1_SIZE 20 +#define MAX_SUM_SIZE 20 + +enum enctype { + NONE, + GZIP, + BZIP2, + LZMA, + XZ, +}; + +struct chksumval { + int alg; + size_t len; + unsigned char val[MAX_SUM_SIZE]; +}; + +struct chksumwork { + int alg; +#ifdef ARCHIVE_HAS_MD5 + archive_md5_ctx md5ctx; +#endif +#ifdef ARCHIVE_HAS_SHA1 + archive_sha1_ctx sha1ctx; +#endif +}; + +struct xattr { + struct xattr *next; + struct archive_string name; + uint64_t id; + uint64_t length; + uint64_t offset; + uint64_t size; + enum enctype encoding; + struct chksumval a_sum; + struct chksumval e_sum; + struct archive_string fstype; +}; + +struct xar_file { + struct xar_file *next; + struct xar_file *hdnext; + struct xar_file *parent; + int subdirs; + + unsigned int has; +#define HAS_DATA 0x00001 +#define HAS_PATHNAME 0x00002 +#define HAS_SYMLINK 0x00004 +#define HAS_TIME 0x00008 +#define HAS_UID 0x00010 +#define HAS_GID 0x00020 +#define HAS_MODE 0x00040 +#define HAS_TYPE 0x00080 +#define HAS_DEV 0x00100 +#define HAS_DEVMAJOR 0x00200 +#define HAS_DEVMINOR 0x00400 
+#define HAS_INO 0x00800 +#define HAS_FFLAGS 0x01000 +#define HAS_XATTR 0x02000 +#define HAS_ACL 0x04000 + + uint64_t id; + uint64_t length; + uint64_t offset; + uint64_t size; + enum enctype encoding; + struct chksumval a_sum; + struct chksumval e_sum; + struct archive_string pathname; + struct archive_string symlink; + time_t ctime; + time_t mtime; + time_t atime; + struct archive_string uname; + uid_t uid; + struct archive_string gname; + gid_t gid; + mode_t mode; + dev_t dev; + dev_t devmajor; + dev_t devminor; + int64_t ino64; + struct archive_string fflags_text; + unsigned int link; + unsigned int nlink; + struct archive_string hardlink; + struct xattr *xattr_list; +}; + +struct hdlink { + struct hdlink *next; + + unsigned int id; + int cnt; + struct xar_file *files; +}; + +struct heap_queue { + struct xar_file **files; + int allocated; + int used; +}; + +enum xmlstatus { + INIT, + XAR, + TOC, + TOC_CREATION_TIME, + TOC_CHECKSUM, + TOC_CHECKSUM_OFFSET, + TOC_CHECKSUM_SIZE, + TOC_FILE, + FILE_DATA, + FILE_DATA_LENGTH, + FILE_DATA_OFFSET, + FILE_DATA_SIZE, + FILE_DATA_ENCODING, + FILE_DATA_A_CHECKSUM, + FILE_DATA_E_CHECKSUM, + FILE_DATA_CONTENT, + FILE_EA, + FILE_EA_LENGTH, + FILE_EA_OFFSET, + FILE_EA_SIZE, + FILE_EA_ENCODING, + FILE_EA_A_CHECKSUM, + FILE_EA_E_CHECKSUM, + FILE_EA_NAME, + FILE_EA_FSTYPE, + FILE_CTIME, + FILE_MTIME, + FILE_ATIME, + FILE_GROUP, + FILE_GID, + FILE_USER, + FILE_UID, + FILE_MODE, + FILE_DEVICE, + FILE_DEVICE_MAJOR, + FILE_DEVICE_MINOR, + FILE_DEVICENO, + FILE_INODE, + FILE_LINK, + FILE_TYPE, + FILE_NAME, + FILE_ACL, + FILE_ACL_DEFAULT, + FILE_ACL_ACCESS, + FILE_ACL_APPLEEXTENDED, + /* BSD file flags. */ + FILE_FLAGS, + FILE_FLAGS_USER_NODUMP, + FILE_FLAGS_USER_IMMUTABLE, + FILE_FLAGS_USER_APPEND, + FILE_FLAGS_USER_OPAQUE, + FILE_FLAGS_USER_NOUNLINK, + FILE_FLAGS_SYS_ARCHIVED, + FILE_FLAGS_SYS_IMMUTABLE, + FILE_FLAGS_SYS_APPEND, + FILE_FLAGS_SYS_NOUNLINK, + FILE_FLAGS_SYS_SNAPSHOT, + /* Linux file flags. 
*/ + FILE_EXT2, + FILE_EXT2_SecureDeletion, + FILE_EXT2_Undelete, + FILE_EXT2_Compress, + FILE_EXT2_Synchronous, + FILE_EXT2_Immutable, + FILE_EXT2_AppendOnly, + FILE_EXT2_NoDump, + FILE_EXT2_NoAtime, + FILE_EXT2_CompDirty, + FILE_EXT2_CompBlock, + FILE_EXT2_NoCompBlock, + FILE_EXT2_CompError, + FILE_EXT2_BTree, + FILE_EXT2_HashIndexed, + FILE_EXT2_iMagic, + FILE_EXT2_Journaled, + FILE_EXT2_NoTail, + FILE_EXT2_DirSync, + FILE_EXT2_TopDir, + FILE_EXT2_Reserved, + UNKNOWN, +}; + +struct unknown_tag { + struct unknown_tag *next; + struct archive_string name; +}; + +struct xar { + uint64_t offset; /* Current position in the file. */ + int64_t total; + uint64_t h_base; + int end_of_file; + unsigned char buff[1024*32]; + + enum xmlstatus xmlsts; + enum xmlstatus xmlsts_unknown; + struct unknown_tag *unknowntags; + int base64text; + + /* + * TOC + */ + uint64_t toc_remaining; + uint64_t toc_total; + uint64_t toc_chksum_offset; + uint64_t toc_chksum_size; + + /* + * For Decoding data. + */ + enum enctype rd_encoding; + z_stream stream; + int stream_valid; +#ifdef HAVE_BZLIB_H + bz_stream bzstream; + int bzstream_valid; +#endif +#if HAVE_LZMA_H && HAVE_LIBLZMA + lzma_stream lzstream; + int lzstream_valid; +#elif HAVE_LZMADEC_H && HAVE_LIBLZMADEC + lzmadec_stream lzstream; + int lzstream_valid; +#endif + /* + * For Checksum data. + */ + struct chksumwork a_sumwrk; + struct chksumwork e_sumwrk; + + struct xar_file *file; /* current reading file. */ + struct xattr *xattr; /* current reading extended attribute. 
*/ + struct heap_queue file_queue; + struct xar_file *hdlink_orgs; + struct hdlink *hdlink_list; + + int entry_init; + uint64_t entry_total; + uint64_t entry_remaining; + uint64_t entry_size; + enum enctype entry_encoding; + struct chksumval entry_a_sum; + struct chksumval entry_e_sum; +}; + +struct xmlattr { + struct xmlattr *next; + char *name; + char *value; +}; + +struct xmlattr_list { + struct xmlattr *first; + struct xmlattr **last; +}; + +static int xar_bid(struct archive_read *); +static int xar_read_header(struct archive_read *, + struct archive_entry *); +static int xar_read_data(struct archive_read *, + const void **, size_t *, off_t *); +static int xar_read_data_skip(struct archive_read *); +static int xar_cleanup(struct archive_read *); +static int move_reading_point(struct archive_read *, uint64_t); +static int rd_contents_init(struct archive_read *, + enum enctype, int, int); +static int rd_contents(struct archive_read *, const void **, + size_t *, size_t *, uint64_t); +static uint64_t atol10(const char *, size_t); +static int64_t atol8(const char *, size_t); +static size_t atohex(unsigned char *, size_t, const char *, size_t); +static time_t parse_time(const char *p, size_t n); +static void heap_add_entry(struct heap_queue *, struct xar_file *); +static struct xar_file *heap_get_entry(struct heap_queue *); +static void add_link(struct xar *, struct xar_file *); +static void checksum_init(struct archive_read *, int, int); +static void checksum_update(struct archive_read *, const void *, + size_t, const void *, size_t); +static int checksum_final(struct archive_read *, const void *, + size_t, const void *, size_t); +static int decompression_init(struct archive_read *, enum enctype); +static int decompress(struct archive_read *, const void **, + size_t *, const void *, size_t *); +static int decompression_cleanup(struct archive_read *); +static void xmlattr_cleanup(struct xmlattr_list *); +static void file_new(struct xar *, struct xmlattr_list *); 
+static void file_free(struct xar_file *); +static void xattr_new(struct xar *, struct xmlattr_list *); +static void xattr_free(struct xattr *); +static int getencoding(struct xmlattr_list *); +static int getsumalgorithm(struct xmlattr_list *); +static void unknowntag_start(struct xar *, const char *); +static void unknowntag_end(struct xar *, const char *); +static void xml_start(void *, const char *, struct xmlattr_list *); +static void xml_end(void *, const char *); +static void xml_data(void *, const char *, int); +static int xml_parse_file_flags(struct xar *, const char *); +static int xml_parse_file_ext2(struct xar *, const char *); +#if defined(HAVE_LIBXML_XMLREADER_H) +static int xml2_xmlattr_setup(struct xmlattr_list *, xmlTextReaderPtr); +static int xml2_read_cb(void *, char *, int); +static int xml2_close_cb(void *); +static void xml2_error_hdr(void *, const char *, xmlParserSeverities, + xmlTextReaderLocatorPtr); +static int xml2_read_toc(struct archive_read *); +#elif defined(HAVE_BSDXML_H) || defined(HAVE_EXPAT_H) +static void expat_xmlattr_setup(struct xmlattr_list *, const XML_Char **); +static void expat_start_cb(void *, const XML_Char *, const XML_Char **); +static void expat_end_cb(void *, const XML_Char *); +static void expat_data_cb(void *, const XML_Char *, int); +static int expat_read_toc(struct archive_read *); +#endif + +int +archive_read_support_format_xar(struct archive *_a) +{ + struct xar *xar; + struct archive_read *a = (struct archive_read *)_a; + int r; + + xar = (struct xar *)calloc(1, sizeof(*xar)); + if (xar == NULL) { + archive_set_error(&a->archive, ENOMEM, + "Can't allocate xar data"); + return (ARCHIVE_FATAL); + } + + r = __archive_read_register_format(a, + xar, + "xar", + xar_bid, + NULL, + xar_read_header, + xar_read_data, + xar_read_data_skip, + xar_cleanup); + if (r != ARCHIVE_OK) + free(xar); + return (r); +} + +static int +xar_bid(struct archive_read *a) +{ + const unsigned char *b; + int bid; + + b = 
__archive_read_ahead(a, HEADER_SIZE, NULL); + if (b == NULL) + return (-1); + + bid = 0; + /* + * Verify magic code + */ + if (archive_be32dec(b) != HEADER_MAGIC) + return (0); + bid += 32; + /* + * Verify header size + */ + if (archive_be16dec(b+4) != HEADER_SIZE) + return (0); + bid += 16; + /* + * Verify header version + */ + if (archive_be16dec(b+6) != HEADER_VERSION) + return (0); + bid += 16; + /* + * Verify type of checksum + */ + switch (archive_be32dec(b+24)) { + case CKSUM_NONE: + case CKSUM_SHA1: + case CKSUM_MD5: + bid += 32; + break; + default: + return (0); + } + + return (bid); +} + +static int +read_toc(struct archive_read *a) +{ + struct xar *xar; + struct xar_file *file; + const unsigned char *b; + uint64_t toc_compressed_size; + uint64_t toc_uncompressed_size; + uint32_t toc_chksum_alg; + ssize_t bytes; + int r; + + xar = (struct xar *)(a->format->data); + + /* + * Read xar header. + */ + b = __archive_read_ahead(a, HEADER_SIZE, &bytes); + if (bytes < 0) + return ((int)bytes); + if (bytes < HEADER_SIZE) { + archive_set_error(&a->archive, + ARCHIVE_ERRNO_FILE_FORMAT, + "Truncated archive header"); + return (ARCHIVE_FATAL); + } + + if (archive_be32dec(b) != HEADER_MAGIC) { + archive_set_error(&a->archive, + ARCHIVE_ERRNO_FILE_FORMAT, + "Invalid header magic"); + return (ARCHIVE_FATAL); + } + if (archive_be16dec(b+6) != HEADER_VERSION) { + archive_set_error(&a->archive, + ARCHIVE_ERRNO_FILE_FORMAT, + "Unsupported header version(%d)", + archive_be16dec(b+6)); + return (ARCHIVE_FATAL); + } + toc_compressed_size = archive_be64dec(b+8); + xar->toc_remaining = toc_compressed_size; + toc_uncompressed_size = archive_be64dec(b+16); + toc_chksum_alg = archive_be32dec(b+24); + __archive_read_consume(a, HEADER_SIZE); + xar->offset += HEADER_SIZE; + xar->toc_total = 0; + + /* + * Read TOC(Table of Contents). + */ + /* Initialize reading contents. 
*/ + r = move_reading_point(a, HEADER_SIZE); + if (r != ARCHIVE_OK) + return (r); + r = rd_contents_init(a, GZIP, toc_chksum_alg, CKSUM_NONE); + if (r != ARCHIVE_OK) + return (r); + +#ifdef HAVE_LIBXML_XMLREADER_H + r = xml2_read_toc(a); +#elif defined(HAVE_BSDXML_H) || defined(HAVE_EXPAT_H) + r = expat_read_toc(a); +#endif + if (r != ARCHIVE_OK) + return (r); + + /* Set 'The HEAP' base. */ + xar->h_base = xar->offset; + if (xar->toc_total != toc_uncompressed_size) { + archive_set_error(&a->archive, ARCHIVE_ERRNO_MISC, + "TOC uncompressed size error"); + return (ARCHIVE_FATAL); + } + + /* + * Checksum TOC + */ + if (toc_chksum_alg != CKSUM_NONE) { + r = move_reading_point(a, xar->toc_chksum_offset); + if (r != ARCHIVE_OK) + return (r); + b = __archive_read_ahead(a, xar->toc_chksum_size, &bytes); + if (bytes < 0) + return ((int)bytes); + if ((uint64_t)bytes < xar->toc_chksum_size) { + archive_set_error(&a->archive, + ARCHIVE_ERRNO_FILE_FORMAT, + "Truncated archive file"); + return (ARCHIVE_FATAL); + } + r = checksum_final(a, b, xar->toc_chksum_size, NULL, 0); + __archive_read_consume(a, xar->toc_chksum_size); + xar->offset += xar->toc_chksum_size; + if (r != ARCHIVE_OK) + return (ARCHIVE_FATAL); + } + + /* + * Connect hardlinked files. + */ + for (file = xar->hdlink_orgs; file != NULL; file = file->hdnext) { + struct hdlink **hdlink; + + for (hdlink = &(xar->hdlink_list); *hdlink != NULL; + hdlink = &((*hdlink)->next)) { + if ((*hdlink)->id == file->id) { + struct hdlink *hltmp; + struct xar_file *f2; + int nlink = (*hdlink)->cnt + 1; + + file->nlink = nlink; + for (f2 = (*hdlink)->files; f2 != NULL; + f2 = f2->hdnext) { + f2->nlink = nlink; + archive_string_copy( + &(f2->hardlink), &(file->pathname)); + } + /* Remove resolved files from hdlist_list. 
*/ + hltmp = *hdlink; + *hdlink = hltmp->next; + free(hltmp); + break; + } + } + } + a->archive.archive_format = ARCHIVE_FORMAT_XAR; + a->archive.archive_format_name = "xar"; + + return (ARCHIVE_OK); +} + +static int +xar_read_header(struct archive_read *a, struct archive_entry *entry) +{ + struct xar *xar; + struct xar_file *file; + struct xattr *xattr; + int r; + + xar = (struct xar *)(a->format->data); + + if (xar->offset == 0) { + /* Read TOC. */ + r = read_toc(a); + if (r != ARCHIVE_OK) + return (r); + } + + for (;;) { + file = xar->file = heap_get_entry(&(xar->file_queue)); + if (file == NULL) { + xar->end_of_file = 1; + return (ARCHIVE_EOF); + } + if ((file->mode & AE_IFMT) != AE_IFDIR) + break; + if (file->has != (HAS_PATHNAME | HAS_TYPE)) + break; + /* + * If a file type is a directory and it does not have + * any metadata, do not export. + */ + file_free(file); + } + archive_entry_set_atime(entry, file->atime, 0); + archive_entry_set_ctime(entry, file->ctime, 0); + archive_entry_set_mtime(entry, file->mtime, 0); + archive_entry_set_gid(entry, file->gid); + if (file->gname.length > 0) + archive_entry_update_gname_utf8(entry, file->gname.s); + archive_entry_set_uid(entry, file->uid); + if (file->uname.length > 0) + archive_entry_update_uname_utf8(entry, file->uname.s); + archive_entry_set_mode(entry, file->mode); + archive_entry_update_pathname_utf8(entry, file->pathname.s); + if (file->symlink.length > 0) + archive_entry_update_symlink_utf8(entry, file->symlink.s); + /* Set proper nlink. 
*/ + if ((file->mode & AE_IFMT) == AE_IFDIR) + archive_entry_set_nlink(entry, file->subdirs + 2); + else + archive_entry_set_nlink(entry, file->nlink); + archive_entry_set_size(entry, file->size); + if (archive_strlen(&(file->hardlink)) > 0) + archive_entry_update_hardlink_utf8(entry, + file->hardlink.s); + archive_entry_set_ino64(entry, file->ino64); + if (file->has & HAS_DEV) + archive_entry_set_dev(entry, file->dev); + if (file->has & HAS_DEVMAJOR) + archive_entry_set_devmajor(entry, file->devmajor); + if (file->has & HAS_DEVMINOR) + archive_entry_set_devminor(entry, file->devminor); + if (archive_strlen(&(file->fflags_text)) > 0) + archive_entry_copy_fflags_text(entry, file->fflags_text.s); + + xar->entry_init = 1; + xar->entry_total = 0; + xar->entry_remaining = file->length; + xar->entry_size = file->size; + xar->entry_encoding = file->encoding; + xar->entry_a_sum = file->a_sum; + xar->entry_e_sum = file->e_sum; + /* + * Read extended attributes. + */ + r = ARCHIVE_OK; + xattr = file->xattr_list; + while (xattr != NULL) { + const void *d; + size_t outbytes, used; + + r = move_reading_point(a, xattr->offset); + if (r != ARCHIVE_OK) + break; + r = rd_contents_init(a, xattr->encoding, + xattr->a_sum.alg, xattr->e_sum.alg); + if (r != ARCHIVE_OK) + break; + d = NULL; + r = rd_contents(a, &d, &outbytes, &used, xattr->length); + if (r != ARCHIVE_OK) + break; + if (outbytes != xattr->size) { + archive_set_error(&(a->archive), ARCHIVE_ERRNO_MISC, + "Decompressed size error"); + r = ARCHIVE_FATAL; + break; + } + r = checksum_final(a, + xattr->a_sum.val, xattr->a_sum.len, + xattr->e_sum.val, xattr->e_sum.len); + if (r != ARCHIVE_OK) + break; + archive_entry_xattr_add_entry(entry, + xattr->name.s, d, outbytes); + xattr = xattr->next; + } + if (r != ARCHIVE_OK) { + file_free(file); + return (r); + } + + if (xar->entry_remaining > 0) + /* Move reading point to the beginning of current + * file contents. 
*/ + r = move_reading_point(a, file->offset); + else + r = ARCHIVE_OK; + + file_free(file); + return (r); +} + +static int +xar_read_data(struct archive_read *a, + const void **buff, size_t *size, off_t *offset) +{ + struct xar *xar; + size_t used; + int r; + + xar = (struct xar *)(a->format->data); + if (xar->end_of_file || xar->entry_remaining <= 0) { + r = ARCHIVE_EOF; + goto abort_read_data; + } + + if (xar->entry_init) { + r = rd_contents_init(a, xar->entry_encoding, + xar->entry_a_sum.alg, xar->entry_e_sum.alg); + if (r != ARCHIVE_OK) { + xar->entry_remaining = 0; + return (r); + } + xar->entry_init = 0; + } + + *buff = NULL; + r = rd_contents(a, buff, size, &used, xar->entry_remaining); + if (r != ARCHIVE_OK) + goto abort_read_data; + + *offset = xar->entry_total; + xar->entry_total += *size; + xar->total += *size; + xar->offset += used; + xar->entry_remaining -= used; + __archive_read_consume(a, used); + + if (xar->entry_remaining == 0) { + if (xar->entry_total != xar->entry_size) { + archive_set_error(&(a->archive), ARCHIVE_ERRNO_MISC, + "Decompressed size error"); + r = ARCHIVE_FATAL; + goto abort_read_data; + } + r = checksum_final(a, + xar->entry_a_sum.val, xar->entry_a_sum.len, + xar->entry_e_sum.val, xar->entry_e_sum.len); + if (r != ARCHIVE_OK) + goto abort_read_data; + } + + return (ARCHIVE_OK); +abort_read_data: + *buff = NULL; + *size = 0; + *offset = xar->total; + return (r); +} + +static int +xar_read_data_skip(struct archive_read *a) +{ + struct xar *xar; + int64_t bytes_skipped; + + xar = (struct xar *)(a->format->data); + if (xar->end_of_file) + return (ARCHIVE_EOF); + bytes_skipped = __archive_read_skip(a, xar->entry_remaining); + if (bytes_skipped < 0) + return (ARCHIVE_FATAL); + xar->offset += bytes_skipped; + return (ARCHIVE_OK); +} + +static int +xar_cleanup(struct archive_read *a) +{ + struct xar *xar; + struct hdlink *hdlink; + int i; + int r; + + xar = (struct xar *)(a->format->data); + r = decompression_cleanup(a); + hdlink = 
xar->hdlink_list; + while (hdlink != NULL) { + struct hdlink *next = hdlink->next; + + free(hdlink); + hdlink = next; + } + for (i = 0; i < xar->file_queue.used; i++) + file_free(xar->file_queue.files[i]); + while (xar->unknowntags != NULL) { + struct unknown_tag *tag; + + tag = xar->unknowntags; + xar->unknowntags = tag->next; + archive_string_free(&(tag->name)); + free(tag); + } + free(xar); + a->format->data = NULL; + return (r); +} + +static int +move_reading_point(struct archive_read *a, uint64_t offset) +{ + struct xar *xar; + + xar = (struct xar *)(a->format->data); + if (xar->offset - xar->h_base != offset) { + /* Seek forward to the start of file contents. */ + int64_t step; + + step = offset - (xar->offset - xar->h_base); + if (step > 0) { + step = __archive_read_skip(a, step); + if (step < 0) + return ((int)step); + xar->offset += step; + } else { + archive_set_error(&(a->archive), + ARCHIVE_ERRNO_MISC, + "Cannot seek."); + return (ARCHIVE_FAILED); + } + } + return (ARCHIVE_OK); +} + +static int +rd_contents_init(struct archive_read *a, enum enctype encoding, + int a_sum_alg, int e_sum_alg) +{ + int r; + + /* Init decompress library. */ + if ((r = decompression_init(a, encoding)) != ARCHIVE_OK) + return (r); + /* Init checksum library. */ + checksum_init(a, a_sum_alg, e_sum_alg); + return (ARCHIVE_OK); +} + +static int +rd_contents(struct archive_read *a, const void **buff, size_t *size, + size_t *used, uint64_t remaining) +{ + const unsigned char *b; + ssize_t bytes; + + /* Get whatever bytes are immediately available. */ + b = __archive_read_ahead(a, 1, &bytes); + if (bytes < 0) + return ((int)bytes); + if (bytes == 0) { + archive_set_error(&a->archive, ARCHIVE_ERRNO_MISC, + "Truncated archive file"); + return (ARCHIVE_FATAL); + } + if ((uint64_t)bytes > remaining) + bytes = (ssize_t)remaining; + + /* + * Decompress contents of file. 
+ */ + *used = bytes; + if (decompress(a, buff, size, b, used) != ARCHIVE_OK) + return (ARCHIVE_FATAL); + + /* + * Update checksum of a compressed data and a extracted data. + */ + checksum_update(a, b, *used, *buff, *size); + + return (ARCHIVE_OK); +} + +/* + * Note that this implementation does not (and should not!) obey + * locale settings; you cannot simply substitute strtol here, since + * it does obey locale. + */ + +static uint64_t +atol10(const char *p, size_t char_cnt) +{ + uint64_t l; + int digit; + + l = 0; + digit = *p - '0'; + while (digit >= 0 && digit < 10 && char_cnt-- > 0) { + l = (l * 10) + digit; + digit = *++p - '0'; + } + return (l); +} + +static int64_t +atol8(const char *p, size_t char_cnt) +{ + int64_t l; + int digit; + + l = 0; + while (char_cnt-- > 0) { + if (*p >= '0' && *p <= '7') + digit = *p - '0'; + else + break; + p++; + l <<= 3; + l |= digit; + } + return (l); +} + +static size_t +atohex(unsigned char *b, size_t bsize, const char *p, size_t psize) +{ + size_t fbsize = bsize; + + while (bsize && psize > 1) { + unsigned char x; + + if (p[0] >= 'a' && p[0] <= 'z') + x = (p[0] - 'a' + 0x0a) << 4; + else if (p[0] >= 'A' && p[0] <= 'Z') + x = (p[0] - 'A' + 0x0a) << 4; + else if (p[0] >= '0' && p[0] <= '9') + x = (p[0] - '0') << 4; + else + return (-1); + if (p[1] >= 'a' && p[1] <= 'z') + x |= p[1] - 'a' + 0x0a; + else if (p[1] >= 'A' && p[1] <= 'Z') + x |= p[1] - 'A' + 0x0a; + else if (p[1] >= '0' && p[1] <= '9') + x |= p[1] - '0'; + else + return (-1); + + *b++ = x; + bsize--; + p += 2; + psize -= 2; + } + return (fbsize - bsize); +} + +static time_t +time_from_tm(struct tm *t) +{ +#if HAVE_TIMEGM + /* Use platform timegm() if available. */ + return (timegm(t)); +#else + /* Else use direct calculation using POSIX assumptions. */ + /* First, fix up tm_yday based on the year/month/day. */ + mktime(t); + /* Then we can compute timegm() from first principles. 
*/ + return (t->tm_sec + t->tm_min * 60 + t->tm_hour * 3600 + + t->tm_yday * 86400 + (t->tm_year - 70) * 31536000 + + ((t->tm_year - 69) / 4) * 86400 - + ((t->tm_year - 1) / 100) * 86400 + + ((t->tm_year + 299) / 400) * 86400); +#endif +} + +static time_t +parse_time(const char *p, size_t n) +{ + struct tm tm; + time_t t = 0; + int64_t data; + + memset(&tm, 0, sizeof(tm)); + if (n != 20) + return (t); + data = atol10(p, 4); + if (data < 1900) + return (t); + tm.tm_year = (int)data - 1900; + p += 4; + if (*p++ != '-') + return (t); + data = atol10(p, 2); + if (data < 1 || data > 12) + return (t); + tm.tm_mon = (int)data -1; + p += 2; + if (*p++ != '-') + return (t); + data = atol10(p, 2); + if (data < 1 || data > 31) + return (t); + tm.tm_mday = (int)data; + p += 2; + if (*p++ != 'T') + return (t); + data = atol10(p, 2); + if (data < 0 || data > 23) + return (t); + tm.tm_hour = (int)data; + p += 2; + if (*p++ != ':') + return (t); + data = atol10(p, 2); + if (data < 0 || data > 59) + return (t); + tm.tm_min = (int)data; + p += 2; + if (*p++ != ':') + return (t); + data = atol10(p, 2); + if (data < 0 || data > 60) + return (t); + tm.tm_sec = (int)data; +#if 0 + p += 2; + if (*p != 'Z') + return (t); +#endif + + t = time_from_tm(&tm); + + return (t); +} + +static void +heap_add_entry(struct heap_queue *heap, struct xar_file *file) +{ + uint64_t file_id, parent_id; + int hole, parent; + + /* Expand our pending files list as necessary. */ + if (heap->used >= heap->allocated) { + struct xar_file **new_pending_files; + int new_size = heap->allocated * 2; + + if (heap->allocated < 1024) + new_size = 1024; + /* Overflow might keep us from growing the list. 
*/ + if (new_size <= heap->allocated) + __archive_errx(1, "Out of memory"); + new_pending_files = (struct xar_file **) + malloc(new_size * sizeof(new_pending_files[0])); + if (new_pending_files == NULL) + __archive_errx(1, "Out of memory"); + memcpy(new_pending_files, heap->files, + heap->allocated * sizeof(new_pending_files[0])); + if (heap->files != NULL) + free(heap->files); + heap->files = new_pending_files; + heap->allocated = new_size; + } + + file_id = file->id; + + /* + * Start with hole at end, walk it up tree to find insertion point. + */ + hole = heap->used++; + while (hole > 0) { + parent = (hole - 1)/2; + parent_id = heap->files[parent]->id; + if (file_id >= parent_id) { + heap->files[hole] = file; + return; + } + // Move parent into hole <==> move hole up tree. + heap->files[hole] = heap->files[parent]; + hole = parent; + } + heap->files[0] = file; +} + +static struct xar_file * +heap_get_entry(struct heap_queue *heap) +{ + uint64_t a_id, b_id, c_id; + int a, b, c; + struct xar_file *r, *tmp; + + if (heap->used < 1) + return (NULL); + + /* + * The first file in the list is the earliest; we'll return this. + */ + r = heap->files[0]; + + /* + * Move the last item in the heap to the root of the tree + */ + heap->files[0] = heap->files[--(heap->used)]; + + /* + * Rebalance the heap. + */ + a = 0; // Starting element and its heap key + a_id = heap->files[a]->id; + for (;;) { + b = a + a + 1; // First child + if (b >= heap->used) + return (r); + b_id = heap->files[b]->id; + c = b + 1; // Use second child if it is smaller. 
+ if (c < heap->used) { + c_id = heap->files[c]->id; + if (c_id < b_id) { + b = c; + b_id = c_id; + } + } + if (a_id <= b_id) + return (r); + tmp = heap->files[a]; + heap->files[a] = heap->files[b]; + heap->files[b] = tmp; + a = b; + } +} + +static void +add_link(struct xar *xar, struct xar_file *file) +{ + struct hdlink *hdlink; + + for (hdlink = xar->hdlink_list; hdlink != NULL; hdlink = hdlink->next) { + if (hdlink->id == file->link) { + file->hdnext = hdlink->files; + hdlink->cnt++; + hdlink->files = file; + return; + } + } + hdlink = malloc(sizeof(*hdlink)); + if (hdlink == NULL) + __archive_errx(1, "No memory for add_link()"); + file->hdnext = NULL; + hdlink->id = file->link; + hdlink->cnt = 1; + hdlink->files = file; + hdlink->next = xar->hdlink_list; + xar->hdlink_list = hdlink; +} + +static void +_checksum_init(struct chksumwork *sumwrk, int sum_alg) +{ + sumwrk->alg = sum_alg; + switch (sum_alg) { + case CKSUM_NONE: + break; + case CKSUM_SHA1: + archive_sha1_init(&(sumwrk->sha1ctx)); + break; + case CKSUM_MD5: + archive_md5_init(&(sumwrk->md5ctx)); + break; + } +} + +static void +_checksum_update(struct chksumwork *sumwrk, const void *buff, size_t size) +{ + + switch (sumwrk->alg) { + case CKSUM_NONE: + break; + case CKSUM_SHA1: + archive_sha1_update(&(sumwrk->sha1ctx), buff, size); + break; + case CKSUM_MD5: + archive_md5_update(&(sumwrk->md5ctx), buff, size); + break; + } +} + +static int +_checksum_final(struct chksumwork *sumwrk, const void *val, size_t len) +{ + unsigned char sum[MAX_SUM_SIZE]; + int r = ARCHIVE_OK; + + switch (sumwrk->alg) { + case CKSUM_NONE: + break; + case CKSUM_SHA1: + archive_sha1_final(&(sumwrk->sha1ctx), sum); + if (len != SHA1_SIZE || + memcmp(val, sum, SHA1_SIZE) != 0) + r = ARCHIVE_FAILED; + break; + case CKSUM_MD5: + archive_md5_final(&(sumwrk->md5ctx), sum); + if (len != MD5_SIZE || + memcmp(val, sum, MD5_SIZE) != 0) + r = ARCHIVE_FAILED; + break; + } + return (r); +} + +static void +checksum_init(struct archive_read *a, 
int a_sum_alg, int e_sum_alg) +{ + struct xar *xar; + + xar = (struct xar *)(a->format->data); + _checksum_init(&(xar->a_sumwrk), a_sum_alg); + _checksum_init(&(xar->e_sumwrk), e_sum_alg); +} + +static void +checksum_update(struct archive_read *a, const void *abuff, size_t asize, + const void *ebuff, size_t esize) +{ + struct xar *xar; + + xar = (struct xar *)(a->format->data); + _checksum_update(&(xar->a_sumwrk), abuff, asize); + _checksum_update(&(xar->e_sumwrk), ebuff, esize); +} + +static int +checksum_final(struct archive_read *a, const void *a_sum_val, + size_t a_sum_len, const void *e_sum_val, size_t e_sum_len) +{ + struct xar *xar; + int r; + + xar = (struct xar *)(a->format->data); + r = _checksum_final(&(xar->a_sumwrk), a_sum_val, a_sum_len); + if (r == ARCHIVE_OK) + r = _checksum_final(&(xar->e_sumwrk), e_sum_val, e_sum_len); + if (r != ARCHIVE_OK) + archive_set_error(&(a->archive), ARCHIVE_ERRNO_MISC, + "Sumcheck error"); + return (r); +} + +static int +decompression_init(struct archive_read *a, enum enctype encoding) +{ + struct xar *xar; + const char *detail; + int r; + + xar = (struct xar *)(a->format->data); + xar->rd_encoding = encoding; + switch (encoding) { + case NONE: + break; + case GZIP: + if (xar->stream_valid) + r = inflateReset(&(xar->stream)); + else + r = inflateInit(&(xar->stream)); + if (r != Z_OK) { + archive_set_error(&a->archive, ARCHIVE_ERRNO_MISC, + "Couldn't initialize zlib stream."); + return (ARCHIVE_FATAL); + } + xar->stream_valid = 1; + xar->stream.total_in = 0; + xar->stream.total_out = 0; + break; +#ifdef HAVE_BZLIB_H + case BZIP2: + if (xar->bzstream_valid) { + BZ2_bzDecompressEnd(&(xar->bzstream)); + xar->bzstream_valid = 0; + } + r = BZ2_bzDecompressInit(&(xar->bzstream), 0, 0); + if (r == BZ_MEM_ERROR) + r = BZ2_bzDecompressInit(&(xar->bzstream), 0, 1); + if (r != BZ_OK) { + int err = ARCHIVE_ERRNO_MISC; + detail = NULL; + switch (r) { + case BZ_PARAM_ERROR: + detail = "invalid setup parameter"; + break; + case 
BZ_MEM_ERROR: + err = ENOMEM; + detail = "out of memory"; + break; + case BZ_CONFIG_ERROR: + detail = "mis-compiled library"; + break; + } + archive_set_error(&a->archive, err, + "Internal error initializing decompressor: %s", + detail == NULL ? "??" : detail); + xar->bzstream_valid = 0; + return (ARCHIVE_FATAL); + } + xar->bzstream_valid = 1; + xar->bzstream.total_in_lo32 = 0; + xar->bzstream.total_in_hi32 = 0; + xar->bzstream.total_out_lo32 = 0; + xar->bzstream.total_out_hi32 = 0; + break; +#endif +#if defined(HAVE_LZMA_H) && defined(HAVE_LIBLZMA) + case XZ: + case LZMA: + if (xar->lzstream_valid) { + lzma_end(&(xar->lzstream)); + xar->lzstream_valid = 0; + } + if (xar->entry_encoding == XZ) + r = lzma_stream_decoder(&(xar->lzstream), + (1U << 30),/* memlimit */ + LZMA_CONCATENATED); + else + r = lzma_alone_decoder(&(xar->lzstream), + (1U << 30));/* memlimit */ + if (r != LZMA_OK) { + switch (r) { + case LZMA_MEM_ERROR: + archive_set_error(&a->archive, + ENOMEM, + "Internal error initializing " + "compression library: " + "Cannot allocate memory"); + break; + case LZMA_OPTIONS_ERROR: + archive_set_error(&a->archive, + ARCHIVE_ERRNO_MISC, + "Internal error initializing " + "compression library: " + "Invalid or unsupported options"); + break; + default: + archive_set_error(&a->archive, + ARCHIVE_ERRNO_MISC, + "Internal error initializing " + "lzma library"); + break; + } + return (ARCHIVE_FATAL); + } + xar->lzstream_valid = 1; + xar->lzstream.total_in = 0; + xar->lzstream.total_out = 0; + break; +#elif defined(HAVE_LZMADEC_H) && defined(HAVE_LIBLZMADEC) + case LZMA: + if (xar->lzstream_valid) + lzmadec_end(&(xar->lzstream)); + r = lzmadec_init(&(xar->lzstream)); + if (r != LZMADEC_OK) { + switch (r) { + case LZMADEC_HEADER_ERROR: + archive_set_error(&a->archive, + ARCHIVE_ERRNO_MISC, + "Internal error initializing " + "compression library: " + "invalid header"); + break; + case LZMADEC_MEM_ERROR: + archive_set_error(&a->archive, + ENOMEM, + "Internal error 
initializing " + "compression library: " + "out of memory"); + break; + } + return (ARCHIVE_FATAL); + } + xar->lzstream_valid = 1; + xar->lzstream.total_in = 0; + xar->lzstream.total_out = 0; + break; +#endif + /* + * Unsupported compression. + */ + default: +#ifndef HAVE_BZLIB_H + case BZIP2: +#endif +#if !defined(HAVE_LZMA_H) || !defined(HAVE_LIBLZMA) +#if !defined(HAVE_LZMADEC_H) || !defined(HAVE_LIBLZMADEC) + case LZMA: +#endif + case XZ: +#endif + switch (xar->entry_encoding) { + case BZIP2: detail = "bzip2"; break; + case LZMA: detail = "lzma"; break; + case XZ: detail = "xz"; break; + default: detail = "??"; break; + } + archive_set_error(&a->archive, ARCHIVE_ERRNO_MISC, + "%s compression not supported on this platform", + detail); + return (ARCHIVE_FAILED); + } + return (ARCHIVE_OK); +} + +static int +decompress(struct archive_read *a, const void **buff, size_t *outbytes, + const void *b, size_t *used) +{ + struct xar *xar; + void *outbuff; + size_t avail_in, avail_out; + int r; + + xar = (struct xar *)(a->format->data); + avail_in = *used; + outbuff = (void *)(uintptr_t)*buff; + if (outbuff == NULL) { + outbuff = xar->buff; + *buff = outbuff; + avail_out = sizeof(xar->buff); + } else + avail_out = *outbytes; + switch (xar->rd_encoding) { + case GZIP: + xar->stream.next_in = (Bytef *)(uintptr_t)b; + xar->stream.avail_in = avail_in; + xar->stream.next_out = (unsigned char *)outbuff; + xar->stream.avail_out = avail_out; + r = inflate(&(xar->stream), 0); + switch (r) { + case Z_OK: /* Decompressor made some progress.*/ + case Z_STREAM_END: /* Found end of stream. 
*/ + break; + default: + archive_set_error(&a->archive, ARCHIVE_ERRNO_MISC, + "File decompression failed (%d)", r); + return (ARCHIVE_FATAL); + } + *used = avail_in - xar->stream.avail_in; + *outbytes = avail_out - xar->stream.avail_out; + break; +#ifdef HAVE_BZLIB_H + case BZIP2: + xar->bzstream.next_in = (char *)(uintptr_t)b; + xar->bzstream.avail_in = avail_in; + xar->bzstream.next_out = (char *)outbuff; + xar->bzstream.avail_out = avail_out; + r = BZ2_bzDecompress(&(xar->bzstream)); + switch (r) { + case BZ_STREAM_END: /* Found end of stream. */ + switch (BZ2_bzDecompressEnd(&(xar->bzstream))) { + case BZ_OK: + break; + default: + archive_set_error(&(a->archive), + ARCHIVE_ERRNO_MISC, + "Failed to clean up decompressor"); + return (ARCHIVE_FATAL); + } + xar->bzstream_valid = 0; + /* FALLTHROUGH */ + case BZ_OK: /* Decompressor made some progress. */ + break; + default: + archive_set_error(&(a->archive), + ARCHIVE_ERRNO_MISC, + "bzip decompression failed"); + return (ARCHIVE_FATAL); + } + *used = avail_in - xar->bzstream.avail_in; + *outbytes = avail_out - xar->bzstream.avail_out; + break; +#endif +#if defined(HAVE_LZMA_H) && defined(HAVE_LIBLZMA) + case LZMA: + case XZ: + xar->lzstream.next_in = b; + xar->lzstream.avail_in = avail_in; + xar->lzstream.next_out = (unsigned char *)outbuff; + xar->lzstream.avail_out = avail_out; + r = lzma_code(&(xar->lzstream), LZMA_RUN); + switch (r) { + case LZMA_STREAM_END: /* Found end of stream. */ + lzma_end(&(xar->lzstream)); + xar->lzstream_valid = 0; + /* FALLTHROUGH */ + case LZMA_OK: /* Decompressor made some progress. 
*/ + break; + default: + archive_set_error(&(a->archive), + ARCHIVE_ERRNO_MISC, + "%s decompression failed(%d)", + (xar->entry_encoding == XZ)?"xz":"lzma", + r); + return (ARCHIVE_FATAL); + } + *used = avail_in - xar->lzstream.avail_in; + *outbytes = avail_out - xar->lzstream.avail_out; + break; +#elif defined(HAVE_LZMADEC_H) && defined(HAVE_LIBLZMADEC) + case LZMA: + xar->lzstream.next_in = (unsigned char *)(uintptr_t)b; + xar->lzstream.avail_in = avail_in; + xar->lzstream.next_out = (unsigned char *)outbuff; + xar->lzstream.avail_out = avail_out; + r = lzmadec_decode(&(xar->lzstream), 0); + switch (r) { + case LZMADEC_STREAM_END: /* Found end of stream. */ + switch (lzmadec_end(&(xar->lzstream))) { + case LZMADEC_OK: + break; + default: + archive_set_error(&(a->archive), + ARCHIVE_ERRNO_MISC, + "Failed to clean up lzmadec decompressor"); + return (ARCHIVE_FATAL); + } + xar->lzstream_valid = 0; + /* FALLTHROUGH */ + case LZMADEC_OK: /* Decompressor made some progress. */ + break; + default: + archive_set_error(&(a->archive), + ARCHIVE_ERRNO_MISC, + "lzmadec decompression failed(%d)", + r); + return (ARCHIVE_FATAL); + } + *used = avail_in - xar->lzstream.avail_in; + *outbytes = avail_out - xar->lzstream.avail_out; + break; +#endif +#ifndef HAVE_BZLIB_H + case BZIP2: +#endif +#if !defined(HAVE_LZMA_H) || !defined(HAVE_LIBLZMA) +#if !defined(HAVE_LZMADEC_H) || !defined(HAVE_LIBLZMADEC) + case LZMA: +#endif + case XZ: +#endif + case NONE: + default: + if (outbuff == xar->buff) { + *buff = b; + *used = avail_in; + *outbytes = avail_in; + } else { + if (avail_out > avail_in) + avail_out = avail_in; + memcpy(outbuff, b, avail_out); + *used = avail_out; + *outbytes = avail_out; + } + break; + } + return (ARCHIVE_OK); +} + +static int +decompression_cleanup(struct archive_read *a) +{ + struct xar *xar; + int r; + + xar = (struct xar *)(a->format->data); + r = ARCHIVE_OK; + if (xar->stream_valid) { + if (inflateEnd(&(xar->stream)) != Z_OK) { + archive_set_error(&a->archive, 
+ ARCHIVE_ERRNO_MISC, + "Failed to clean up zlib decompressor"); + r = ARCHIVE_FATAL; + } + } +#ifdef HAVE_BZLIB_H + if (xar->bzstream_valid) { + if (BZ2_bzDecompressEnd(&(xar->bzstream)) != BZ_OK) { + archive_set_error(&a->archive, + ARCHIVE_ERRNO_MISC, + "Failed to clean up bzip2 decompressor"); + r = ARCHIVE_FATAL; + } + } +#endif +#if defined(HAVE_LZMA_H) && defined(HAVE_LIBLZMA) + if (xar->lzstream_valid) + lzma_end(&(xar->lzstream)); +#elif defined(HAVE_LZMADEC_H) && defined(HAVE_LIBLZMADEC) + if (xar->lzstream_valid) { + if (lzmadec_end(&(xar->lzstream)) != LZMADEC_OK) { + archive_set_error(&a->archive, + ARCHIVE_ERRNO_MISC, + "Failed to clean up lzmadec decompressor"); + r = ARCHIVE_FATAL; + } + } +#endif + return (r); +} + +static void +xmlattr_cleanup(struct xmlattr_list *list) +{ + struct xmlattr *attr, *next; + + attr = list->first; + while (attr != NULL) { + next = attr->next; + free(attr->name); + free(attr->value); + free(attr); + attr = next; + } + list->first = NULL; + list->last = &(list->first); +} + +static void +file_new(struct xar *xar, struct xmlattr_list *list) +{ + struct xar_file *file; + struct xmlattr *attr; + + file = calloc(1, sizeof(*file)); + if (file == NULL) + __archive_errx(1, "Out of memory"); + file->parent = xar->file; + file->mode = 0777 | AE_IFREG; + file->atime = time(NULL); + file->mtime = time(NULL); + xar->file = file; + xar->xattr = NULL; + for (attr = list->first; attr != NULL; attr = attr->next) { + if (strcmp(attr->name, "id") == 0) + file->id = atol10(attr->value, strlen(attr->value)); + } + file->nlink = 1; + heap_add_entry(&(xar->file_queue), file); +} + +static void +file_free(struct xar_file *file) +{ + struct xattr *xattr; + + archive_string_free(&(file->pathname)); + archive_string_free(&(file->symlink)); + archive_string_free(&(file->uname)); + archive_string_free(&(file->gname)); + archive_string_free(&(file->hardlink)); + xattr = file->xattr_list; + while (xattr != NULL) { + struct xattr *next; + + next = 
xattr->next; + xattr_free(xattr); + xattr = next; + } + + free(file); +} + +static void +xattr_new(struct xar *xar, struct xmlattr_list *list) +{ + struct xattr *xattr, **nx; + struct xmlattr *attr; + + xattr = calloc(1, sizeof(*xattr)); + if (xattr == NULL) + __archive_errx(1, "Out of memory"); + xar->xattr = xattr; + for (attr = list->first; attr != NULL; attr = attr->next) { + if (strcmp(attr->name, "id") == 0) + xattr->id = atol10(attr->value, strlen(attr->value)); + } + /* Chain to xattr list. */ + for (nx = &(xar->file->xattr_list); + *nx != NULL; nx = &((*nx)->next)) { + if (xattr->id < (*nx)->id) + break; + } + xattr->next = *nx; + *nx = xattr; +} + +static void +xattr_free(struct xattr *xattr) +{ + archive_string_free(&(xattr->name)); + free(xattr); +} + +static int +getencoding(struct xmlattr_list *list) +{ + struct xmlattr *attr; + enum enctype encoding = NONE; + + for (attr = list->first; attr != NULL; attr = attr->next) { + if (strcmp(attr->name, "style") == 0) { + if (strcmp(attr->value, "application/octet-stream") == 0) + encoding = NONE; + else if (strcmp(attr->value, "application/x-gzip") == 0) + encoding = GZIP; + else if (strcmp(attr->value, "application/x-bzip2") == 0) + encoding = BZIP2; + else if (strcmp(attr->value, "application/x-lzma") == 0) + encoding = LZMA; + else if (strcmp(attr->value, "application/x-xz") == 0) + encoding = XZ; + } + } + return (encoding); +} + +static int +getsumalgorithm(struct xmlattr_list *list) +{ + struct xmlattr *attr; + int alg = CKSUM_NONE; + + for (attr = list->first; attr != NULL; attr = attr->next) { + if (strcmp(attr->name, "style") == 0) { + const char *v = attr->value; + if ((v[0] == 'S' || v[0] == 's') && + (v[1] == 'H' || v[1] == 'h') && + (v[2] == 'A' || v[2] == 'a') && + v[3] == '1' && v[4] == '\0') + alg = CKSUM_SHA1; + if ((v[0] == 'M' || v[0] == 'm') && + (v[1] == 'D' || v[1] == 'd') && + v[2] == '5' && v[3] == '\0') + alg = CKSUM_MD5; + } + } + return (alg); +} + +static void 
+unknowntag_start(struct xar *xar, const char *name) +{ + struct unknown_tag *tag; + +#if DEBUG + fprintf(stderr, "unknowntag_start:%s\n", name); +#endif + tag = malloc(sizeof(*tag)); + if (tag == NULL) + __archive_errx(1, "Out of memory"); + tag->next = xar->unknowntags; + archive_string_init(&(tag->name)); + archive_strcpy(&(tag->name), name); + if (xar->unknowntags == NULL) { + xar->xmlsts_unknown = xar->xmlsts; + xar->xmlsts = UNKNOWN; + } + xar->unknowntags = tag; +} + +static void +unknowntag_end(struct xar *xar, const char *name) +{ + struct unknown_tag *tag; + +#if DEBUG + fprintf(stderr, "unknowntag_end:%s\n", name); +#endif + tag = xar->unknowntags; + if (tag == NULL || name == NULL) + return; + if (strcmp(tag->name.s, name) == 0) { + xar->unknowntags = tag->next; + archive_string_free(&(tag->name)); + free(tag); + if (xar->unknowntags == NULL) + xar->xmlsts = xar->xmlsts_unknown; + } +} + +static void +xml_start(void *userData, const char *name, struct xmlattr_list *list) +{ + struct archive_read *a; + struct xar *xar; + struct xmlattr *attr; + + a = (struct archive_read *)userData; + xar = (struct xar *)(a->format->data); + +#if DEBUG + fprintf(stderr, "xml_sta:[%s]\n", name); + for (attr = list->first; attr != NULL; attr = attr->next) + fprintf(stderr, " attr:\"%s\"=\"%s\"\n", + attr->name, attr->value); +#endif + xar->base64text = 0; + switch (xar->xmlsts) { + case INIT: + if (strcmp(name, "xar") == 0) + xar->xmlsts = XAR; + else + unknowntag_start(xar, name); + break; + case XAR: + if (strcmp(name, "toc") == 0) + xar->xmlsts = TOC; + else + unknowntag_start(xar, name); + break; + case TOC: + if (strcmp(name, "creation-time") == 0) + xar->xmlsts = TOC_CREATION_TIME; + else if (strcmp(name, "checksum") == 0) + xar->xmlsts = TOC_CHECKSUM; + else if (strcmp(name, "file") == 0) { + file_new(xar, list); + xar->xmlsts = TOC_FILE; + } + else + unknowntag_start(xar, name); + break; + case TOC_CHECKSUM: + if (strcmp(name, "offset") == 0) + xar->xmlsts = 
TOC_CHECKSUM_OFFSET; + else if (strcmp(name, "size") == 0) + xar->xmlsts = TOC_CHECKSUM_SIZE; + else + unknowntag_start(xar, name); + break; + case TOC_FILE: + if (strcmp(name, "file") == 0) { + file_new(xar, list); + } + else if (strcmp(name, "data") == 0) + xar->xmlsts = FILE_DATA; + else if (strcmp(name, "ea") == 0) { + xattr_new(xar, list); + xar->xmlsts = FILE_EA; + } + else if (strcmp(name, "ctime") == 0) + xar->xmlsts = FILE_CTIME; + else if (strcmp(name, "mtime") == 0) + xar->xmlsts = FILE_MTIME; + else if (strcmp(name, "atime") == 0) + xar->xmlsts = FILE_ATIME; + else if (strcmp(name, "group") == 0) + xar->xmlsts = FILE_GROUP; + else if (strcmp(name, "gid") == 0) + xar->xmlsts = FILE_GID; + else if (strcmp(name, "user") == 0) + xar->xmlsts = FILE_USER; + else if (strcmp(name, "uid") == 0) + xar->xmlsts = FILE_UID; + else if (strcmp(name, "mode") == 0) + xar->xmlsts = FILE_MODE; + else if (strcmp(name, "device") == 0) + xar->xmlsts = FILE_DEVICE; + else if (strcmp(name, "deviceno") == 0) + xar->xmlsts = FILE_DEVICENO; + else if (strcmp(name, "inode") == 0) + xar->xmlsts = FILE_INODE; + else if (strcmp(name, "link") == 0) + xar->xmlsts = FILE_LINK; + else if (strcmp(name, "type") == 0) { + xar->xmlsts = FILE_TYPE; + for (attr = list->first; attr != NULL; + attr = attr->next) { + if (strcmp(attr->name, "link") != 0) + continue; + if (strcmp(attr->value, "original") == 0) { + xar->file->hdnext = xar->hdlink_orgs; + xar->hdlink_orgs = xar->file; + } else { + xar->file->link = atol10(attr->value, + strlen(attr->value)); + if (xar->file->link > 0) + add_link(xar, xar->file); + } + } + } + else if (strcmp(name, "name") == 0) { + xar->xmlsts = FILE_NAME; + for (attr = list->first; attr != NULL; + attr = attr->next) { + if (strcmp(attr->name, "enctype") == 0 && + strcmp(attr->value, "base64") == 0) + xar->base64text = 1; + } + } + else if (strcmp(name, "acl") == 0) + xar->xmlsts = FILE_ACL; + else if (strcmp(name, "flags") == 0) + xar->xmlsts = FILE_FLAGS; + else if 
(strcmp(name, "ext2") == 0) + xar->xmlsts = FILE_EXT2; + else + unknowntag_start(xar, name); + break; + case FILE_DATA: + if (strcmp(name, "length") == 0) + xar->xmlsts = FILE_DATA_LENGTH; + else if (strcmp(name, "offset") == 0) + xar->xmlsts = FILE_DATA_OFFSET; + else if (strcmp(name, "size") == 0) + xar->xmlsts = FILE_DATA_SIZE; + else if (strcmp(name, "encoding") == 0) { + xar->xmlsts = FILE_DATA_ENCODING; + xar->file->encoding = getencoding(list); + } + else if (strcmp(name, "archived-checksum") == 0) { + xar->xmlsts = FILE_DATA_A_CHECKSUM; + xar->file->a_sum.alg = getsumalgorithm(list); + } + else if (strcmp(name, "extracted-checksum") == 0) { + xar->xmlsts = FILE_DATA_E_CHECKSUM; + xar->file->e_sum.alg = getsumalgorithm(list); + } + else if (strcmp(name, "content") == 0) + xar->xmlsts = FILE_DATA_CONTENT; + else + unknowntag_start(xar, name); + break; + case FILE_DEVICE: + if (strcmp(name, "major") == 0) + xar->xmlsts = FILE_DEVICE_MAJOR; + else if (strcmp(name, "minor") == 0) + xar->xmlsts = FILE_DEVICE_MINOR; + else + unknowntag_start(xar, name); + break; + case FILE_DATA_CONTENT: + unknowntag_start(xar, name); + break; + case FILE_EA: + if (strcmp(name, "length") == 0) + xar->xmlsts = FILE_EA_LENGTH; + else if (strcmp(name, "offset") == 0) + xar->xmlsts = FILE_EA_OFFSET; + else if (strcmp(name, "size") == 0) + xar->xmlsts = FILE_EA_SIZE; + else if (strcmp(name, "encoding") == 0) { + xar->xmlsts = FILE_EA_ENCODING; + xar->xattr->encoding = getencoding(list); + } else if (strcmp(name, "archived-checksum") == 0) + xar->xmlsts = FILE_EA_A_CHECKSUM; + else if (strcmp(name, "extracted-checksum") == 0) + xar->xmlsts = FILE_EA_E_CHECKSUM; + else if (strcmp(name, "name") == 0) + xar->xmlsts = FILE_EA_NAME; + else if (strcmp(name, "fstype") == 0) + xar->xmlsts = FILE_EA_FSTYPE; + else + unknowntag_start(xar, name); + break; + case FILE_ACL: + if (strcmp(name, "appleextended") == 0) + xar->xmlsts = FILE_ACL_APPLEEXTENDED; + else if (strcmp(name, "default") == 0) + 
xar->xmlsts = FILE_ACL_DEFAULT; + else if (strcmp(name, "access") == 0) + xar->xmlsts = FILE_ACL_ACCESS; + else + unknowntag_start(xar, name); + break; + case FILE_FLAGS: + if (!xml_parse_file_flags(xar, name)) + unknowntag_start(xar, name); + break; + case FILE_EXT2: + if (!xml_parse_file_ext2(xar, name)) + unknowntag_start(xar, name); + break; + case TOC_CREATION_TIME: + case TOC_CHECKSUM_OFFSET: + case TOC_CHECKSUM_SIZE: + case FILE_DATA_LENGTH: + case FILE_DATA_OFFSET: + case FILE_DATA_SIZE: + case FILE_DATA_ENCODING: + case FILE_DATA_A_CHECKSUM: + case FILE_DATA_E_CHECKSUM: + case FILE_EA_LENGTH: + case FILE_EA_OFFSET: + case FILE_EA_SIZE: + case FILE_EA_ENCODING: + case FILE_EA_A_CHECKSUM: + case FILE_EA_E_CHECKSUM: + case FILE_EA_NAME: + case FILE_EA_FSTYPE: + case FILE_CTIME: + case FILE_MTIME: + case FILE_ATIME: + case FILE_GROUP: + case FILE_GID: + case FILE_USER: + case FILE_UID: + case FILE_INODE: + case FILE_DEVICE_MAJOR: + case FILE_DEVICE_MINOR: + case FILE_DEVICENO: + case FILE_MODE: + case FILE_TYPE: + case FILE_LINK: + case FILE_NAME: + case FILE_ACL_DEFAULT: + case FILE_ACL_ACCESS: + case FILE_ACL_APPLEEXTENDED: + case FILE_FLAGS_USER_NODUMP: + case FILE_FLAGS_USER_IMMUTABLE: + case FILE_FLAGS_USER_APPEND: + case FILE_FLAGS_USER_OPAQUE: + case FILE_FLAGS_USER_NOUNLINK: + case FILE_FLAGS_SYS_ARCHIVED: + case FILE_FLAGS_SYS_IMMUTABLE: + case FILE_FLAGS_SYS_APPEND: + case FILE_FLAGS_SYS_NOUNLINK: + case FILE_FLAGS_SYS_SNAPSHOT: + case FILE_EXT2_SecureDeletion: + case FILE_EXT2_Undelete: + case FILE_EXT2_Compress: + case FILE_EXT2_Synchronous: + case FILE_EXT2_Immutable: + case FILE_EXT2_AppendOnly: + case FILE_EXT2_NoDump: + case FILE_EXT2_NoAtime: + case FILE_EXT2_CompDirty: + case FILE_EXT2_CompBlock: + case FILE_EXT2_NoCompBlock: + case FILE_EXT2_CompError: + case FILE_EXT2_BTree: + case FILE_EXT2_HashIndexed: + case FILE_EXT2_iMagic: + case FILE_EXT2_Journaled: + case FILE_EXT2_NoTail: + case FILE_EXT2_DirSync: + case FILE_EXT2_TopDir: + case 
FILE_EXT2_Reserved: + case UNKNOWN: + unknowntag_start(xar, name); + break; + } +} + +static void +xml_end(void *userData, const char *name) +{ + struct archive_read *a; + struct xar *xar; + + a = (struct archive_read *)userData; + xar = (struct xar *)(a->format->data); + +#if DEBUG + fprintf(stderr, "xml_end:[%s]\n", name); +#endif + switch (xar->xmlsts) { + case INIT: + break; + case XAR: + if (strcmp(name, "xar") == 0) + xar->xmlsts = INIT; + break; + case TOC: + if (strcmp(name, "toc") == 0) + xar->xmlsts = XAR; + break; + case TOC_CREATION_TIME: + if (strcmp(name, "creation-time") == 0) + xar->xmlsts = TOC; + break; + case TOC_CHECKSUM: + if (strcmp(name, "checksum") == 0) + xar->xmlsts = TOC; + break; + case TOC_CHECKSUM_OFFSET: + if (strcmp(name, "offset") == 0) + xar->xmlsts = TOC_CHECKSUM; + break; + case TOC_CHECKSUM_SIZE: + if (strcmp(name, "size") == 0) + xar->xmlsts = TOC_CHECKSUM; + break; + case TOC_FILE: + if (strcmp(name, "file") == 0) { + if (xar->file->parent != NULL && + ((xar->file->mode & AE_IFMT) == AE_IFDIR)) + xar->file->parent->subdirs++; + xar->file = xar->file->parent; + if (xar->file == NULL) + xar->xmlsts = TOC; + } + break; + case FILE_DATA: + if (strcmp(name, "data") == 0) + xar->xmlsts = TOC_FILE; + break; + case FILE_DATA_LENGTH: + if (strcmp(name, "length") == 0) + xar->xmlsts = FILE_DATA; + break; + case FILE_DATA_OFFSET: + if (strcmp(name, "offset") == 0) + xar->xmlsts = FILE_DATA; + break; + case FILE_DATA_SIZE: + if (strcmp(name, "size") == 0) + xar->xmlsts = FILE_DATA; + break; + case FILE_DATA_ENCODING: + if (strcmp(name, "encoding") == 0) + xar->xmlsts = FILE_DATA; + break; + case FILE_DATA_A_CHECKSUM: + if (strcmp(name, "archived-checksum") == 0) + xar->xmlsts = FILE_DATA; + break; + case FILE_DATA_E_CHECKSUM: + if (strcmp(name, "extracted-checksum") == 0) + xar->xmlsts = FILE_DATA; + break; + case FILE_DATA_CONTENT: + if (strcmp(name, "content") == 0) + xar->xmlsts = FILE_DATA; + break; + case FILE_EA: + if (strcmp(name, 
"ea") == 0) { + xar->xmlsts = TOC_FILE; + xar->xattr = NULL; + } + break; + case FILE_EA_LENGTH: + if (strcmp(name, "length") == 0) + xar->xmlsts = FILE_EA; + break; + case FILE_EA_OFFSET: + if (strcmp(name, "offset") == 0) + xar->xmlsts = FILE_EA; + break; + case FILE_EA_SIZE: + if (strcmp(name, "size") == 0) + xar->xmlsts = FILE_EA; + break; + case FILE_EA_ENCODING: + if (strcmp(name, "encoding") == 0) + xar->xmlsts = FILE_EA; + break; + case FILE_EA_A_CHECKSUM: + if (strcmp(name, "archived-checksum") == 0) + xar->xmlsts = FILE_EA; + break; + case FILE_EA_E_CHECKSUM: + if (strcmp(name, "extracted-checksum") == 0) + xar->xmlsts = FILE_EA; + break; + case FILE_EA_NAME: + if (strcmp(name, "name") == 0) + xar->xmlsts = FILE_EA; + break; + case FILE_EA_FSTYPE: + if (strcmp(name, "fstype") == 0) + xar->xmlsts = FILE_EA; + break; + case FILE_CTIME: + if (strcmp(name, "ctime") == 0) + xar->xmlsts = TOC_FILE; + break; + case FILE_MTIME: + if (strcmp(name, "mtime") == 0) + xar->xmlsts = TOC_FILE; + break; + case FILE_ATIME: + if (strcmp(name, "atime") == 0) + xar->xmlsts = TOC_FILE; + break; + case FILE_GROUP: + if (strcmp(name, "group") == 0) + xar->xmlsts = TOC_FILE; + break; + case FILE_GID: + if (strcmp(name, "gid") == 0) + xar->xmlsts = TOC_FILE; + break; + case FILE_USER: + if (strcmp(name, "user") == 0) + xar->xmlsts = TOC_FILE; + break; + case FILE_UID: + if (strcmp(name, "uid") == 0) + xar->xmlsts = TOC_FILE; + break; + case FILE_MODE: + if (strcmp(name, "mode") == 0) + xar->xmlsts = TOC_FILE; + break; + case FILE_DEVICE: + if (strcmp(name, "device") == 0) + xar->xmlsts = TOC_FILE; + break; + case FILE_DEVICE_MAJOR: + if (strcmp(name, "major") == 0) + xar->xmlsts = FILE_DEVICE; + break; + case FILE_DEVICE_MINOR: + if (strcmp(name, "minor") == 0) + xar->xmlsts = FILE_DEVICE; + break; + case FILE_DEVICENO: + if (strcmp(name, "deviceno") == 0) + xar->xmlsts = TOC_FILE; + break; + case FILE_INODE: + if (strcmp(name, "inode") == 0) + xar->xmlsts = TOC_FILE; + break; + 
case FILE_LINK: + if (strcmp(name, "link") == 0) + xar->xmlsts = TOC_FILE; + break; + case FILE_TYPE: + if (strcmp(name, "type") == 0) + xar->xmlsts = TOC_FILE; + break; + case FILE_NAME: + if (strcmp(name, "name") == 0) + xar->xmlsts = TOC_FILE; + break; + case FILE_ACL: + if (strcmp(name, "acl") == 0) + xar->xmlsts = TOC_FILE; + break; + case FILE_ACL_DEFAULT: + if (strcmp(name, "default") == 0) + xar->xmlsts = FILE_ACL; + break; + case FILE_ACL_ACCESS: + if (strcmp(name, "access") == 0) + xar->xmlsts = FILE_ACL; + break; + case FILE_ACL_APPLEEXTENDED: + if (strcmp(name, "appleextended") == 0) + xar->xmlsts = FILE_ACL; + break; + case FILE_FLAGS: + if (strcmp(name, "flags") == 0) + xar->xmlsts = TOC_FILE; + break; + case FILE_FLAGS_USER_NODUMP: + if (strcmp(name, "UserNoDump") == 0) + xar->xmlsts = FILE_FLAGS; + break; + case FILE_FLAGS_USER_IMMUTABLE: + if (strcmp(name, "UserImmutable") == 0) + xar->xmlsts = FILE_FLAGS; + break; + case FILE_FLAGS_USER_APPEND: + if (strcmp(name, "UserAppend") == 0) + xar->xmlsts = FILE_FLAGS; + break; + case FILE_FLAGS_USER_OPAQUE: + if (strcmp(name, "UserOpaque") == 0) + xar->xmlsts = FILE_FLAGS; + break; + case FILE_FLAGS_USER_NOUNLINK: + if (strcmp(name, "UserNoUnlink") == 0) + xar->xmlsts = FILE_FLAGS; + break; + case FILE_FLAGS_SYS_ARCHIVED: + if (strcmp(name, "SystemArchived") == 0) + xar->xmlsts = FILE_FLAGS; + break; + case FILE_FLAGS_SYS_IMMUTABLE: + if (strcmp(name, "SystemImmutable") == 0) + xar->xmlsts = FILE_FLAGS; + break; + case FILE_FLAGS_SYS_APPEND: + if (strcmp(name, "SystemAppend") == 0) + xar->xmlsts = FILE_FLAGS; + break; + case FILE_FLAGS_SYS_NOUNLINK: + if (strcmp(name, "SystemNoUnlink") == 0) + xar->xmlsts = FILE_FLAGS; + break; + case FILE_FLAGS_SYS_SNAPSHOT: + if (strcmp(name, "SystemSnapshot") == 0) + xar->xmlsts = FILE_FLAGS; + break; + case FILE_EXT2: + if (strcmp(name, "ext2") == 0) + xar->xmlsts = TOC_FILE; + break; + case FILE_EXT2_SecureDeletion: + if (strcmp(name, "SecureDeletion") == 0) + 
xar->xmlsts = FILE_EXT2; + break; + case FILE_EXT2_Undelete: + if (strcmp(name, "Undelete") == 0) + xar->xmlsts = FILE_EXT2; + break; + case FILE_EXT2_Compress: + if (strcmp(name, "Compress") == 0) + xar->xmlsts = FILE_EXT2; + break; + case FILE_EXT2_Synchronous: + if (strcmp(name, "Synchronous") == 0) + xar->xmlsts = FILE_EXT2; + break; + case FILE_EXT2_Immutable: + if (strcmp(name, "Immutable") == 0) + xar->xmlsts = FILE_EXT2; + break; + case FILE_EXT2_AppendOnly: + if (strcmp(name, "AppendOnly") == 0) + xar->xmlsts = FILE_EXT2; + break; + case FILE_EXT2_NoDump: + if (strcmp(name, "NoDump") == 0) + xar->xmlsts = FILE_EXT2; + break; + case FILE_EXT2_NoAtime: + if (strcmp(name, "NoAtime") == 0) + xar->xmlsts = FILE_EXT2; + break; + case FILE_EXT2_CompDirty: + if (strcmp(name, "CompDirty") == 0) + xar->xmlsts = FILE_EXT2; + break; + case FILE_EXT2_CompBlock: + if (strcmp(name, "CompBlock") == 0) + xar->xmlsts = FILE_EXT2; + break; + case FILE_EXT2_NoCompBlock: + if (strcmp(name, "NoCompBlock") == 0) + xar->xmlsts = FILE_EXT2; + break; + case FILE_EXT2_CompError: + if (strcmp(name, "CompError") == 0) + xar->xmlsts = FILE_EXT2; + break; + case FILE_EXT2_BTree: + if (strcmp(name, "BTree") == 0) + xar->xmlsts = FILE_EXT2; + break; + case FILE_EXT2_HashIndexed: + if (strcmp(name, "HashIndexed") == 0) + xar->xmlsts = FILE_EXT2; + break; + case FILE_EXT2_iMagic: + if (strcmp(name, "iMagic") == 0) + xar->xmlsts = FILE_EXT2; + break; + case FILE_EXT2_Journaled: + if (strcmp(name, "Journaled") == 0) + xar->xmlsts = FILE_EXT2; + break; + case FILE_EXT2_NoTail: + if (strcmp(name, "NoTail") == 0) + xar->xmlsts = FILE_EXT2; + break; + case FILE_EXT2_DirSync: + if (strcmp(name, "DirSync") == 0) + xar->xmlsts = FILE_EXT2; + break; + case FILE_EXT2_TopDir: + if (strcmp(name, "TopDir") == 0) + xar->xmlsts = FILE_EXT2; + break; + case FILE_EXT2_Reserved: + if (strcmp(name, "Reserved") == 0) + xar->xmlsts = FILE_EXT2; + break; + case UNKNOWN: + unknowntag_end(xar, name); + break; + } 
+} + +static const int base64[256] = { + -1, -1, -1, -1, -1, -1, -1, -1, + -1, -1, -1, -1, -1, -1, -1, -1, /* 00 - 0F */ + -1, -1, -1, -1, -1, -1, -1, -1, + -1, -1, -1, -1, -1, -1, -1, -1, /* 10 - 1F */ + -1, -1, -1, -1, -1, -1, -1, -1, + -1, -1, -1, 62, -1, -1, -1, 63, /* 20 - 2F */ + 52, 53, 54, 55, 56, 57, 58, 59, + 60, 61, -1, -1, -1, -1, -1, -1, /* 30 - 3F */ + -1, 0, 1, 2, 3, 4, 5, 6, + 7, 8, 9, 10, 11, 12, 13, 14, /* 40 - 4F */ + 15, 16, 17, 18, 19, 20, 21, 22, + 23, 24, 25, -1, -1, -1, -1, -1, /* 50 - 5F */ + -1, 26, 27, 28, 29, 30, 31, 32, + 33, 34, 35, 36, 37, 38, 39, 40, /* 60 - 6F */ + 41, 42, 43, 44, 45, 46, 47, 48, + 49, 50, 51, -1, -1, -1, -1, -1, /* 70 - 7F */ + -1, -1, -1, -1, -1, -1, -1, -1, + -1, -1, -1, -1, -1, -1, -1, -1, /* 80 - 8F */ + -1, -1, -1, -1, -1, -1, -1, -1, + -1, -1, -1, -1, -1, -1, -1, -1, /* 90 - 9F */ + -1, -1, -1, -1, -1, -1, -1, -1, + -1, -1, -1, -1, -1, -1, -1, -1, /* A0 - AF */ + -1, -1, -1, -1, -1, -1, -1, -1, + -1, -1, -1, -1, -1, -1, -1, -1, /* B0 - BF */ + -1, -1, -1, -1, -1, -1, -1, -1, + -1, -1, -1, -1, -1, -1, -1, -1, /* C0 - CF */ + -1, -1, -1, -1, -1, -1, -1, -1, + -1, -1, -1, -1, -1, -1, -1, -1, /* D0 - DF */ + -1, -1, -1, -1, -1, -1, -1, -1, + -1, -1, -1, -1, -1, -1, -1, -1, /* E0 - EF */ + -1, -1, -1, -1, -1, -1, -1, -1, + -1, -1, -1, -1, -1, -1, -1, -1, /* F0 - FF */ +}; + +static void +strappend_base64(struct archive_string *as, const char *s, size_t l) +{ + unsigned char buff[256]; + unsigned char *out; + const unsigned char *b; + size_t len; + + len = 0; + out = buff; + b = (const unsigned char *)s; + while (l > 1) { + int n = 0; + + if (l > 0) { + if (base64[b[0]] < 0 || base64[b[1]] < 0) + break; + n = base64[*b++] << 18; + n |= base64[*b++] << 12; + *out++ = n >> 16; + len++; + l -= 2; + } + if (l > 0) { + if (base64[*b] < 0) + break; + n |= base64[*b++] << 6; + *out++ = (n >> 8) & 0xFF; + len++; + --l; + } + if (l > 0) { + if (base64[*b] < 0) + break; + n |= base64[*b++]; + *out++ = n & 0xFF; + len++; + 
--l; + } + if (len+3 >= sizeof(buff)) { + archive_strncat(as, (const char *)buff, len); + len = 0; + out = buff; + } + } + if (len > 0) + archive_strncat(as, (const char *)buff, len); +} + +static void +xml_data(void *userData, const char *s, int len) +{ + struct archive_read *a; + struct xar *xar; + + a = (struct archive_read *)userData; + xar = (struct xar *)(a->format->data); + +#if DEBUG + { + char buff[1024]; + if (len > sizeof(buff)-1) + len = sizeof(buff)-1; + memcpy(buff, s, len); + buff[len] = 0; + fprintf(stderr, "\tlen=%d:\"%s\"\n", len, buff); + } +#endif + switch (xar->xmlsts) { + case TOC_CHECKSUM_OFFSET: + xar->toc_chksum_offset = atol10(s, len); + break; + case TOC_CHECKSUM_SIZE: + xar->toc_chksum_size = atol10(s, len); + break; + default: + break; + } + if (xar->file == NULL) + return; + + switch (xar->xmlsts) { + case FILE_NAME: + if (xar->file->parent != NULL) { + archive_string_concat(&(xar->file->pathname), + &(xar->file->parent->pathname)); + archive_strappend_char(&(xar->file->pathname), '/'); + } + xar->file->has |= HAS_PATHNAME; + if (xar->base64text) + strappend_base64(&(xar->file->pathname), s, len); + else + archive_strncat(&(xar->file->pathname), s, len); + break; + case FILE_LINK: + xar->file->has |= HAS_SYMLINK; + archive_strncpy(&(xar->file->symlink), s, len); + break; + case FILE_TYPE: + if (strncmp("file", s, len) == 0 || + strncmp("hardlink", s, len) == 0) + xar->file->mode = + (xar->file->mode & ~AE_IFMT) | AE_IFREG; + if (strncmp("directory", s, len) == 0) + xar->file->mode = + (xar->file->mode & ~AE_IFMT) | AE_IFDIR; + if (strncmp("symlink", s, len) == 0) + xar->file->mode = + (xar->file->mode & ~AE_IFMT) | AE_IFLNK; + if (strncmp("character special", s, len) == 0) + xar->file->mode = + (xar->file->mode & ~AE_IFMT) | AE_IFCHR; + if (strncmp("block special", s, len) == 0) + xar->file->mode = + (xar->file->mode & ~AE_IFMT) | AE_IFBLK; + if (strncmp("socket", s, len) == 0) + xar->file->mode = + (xar->file->mode & ~AE_IFMT) | 
AE_IFSOCK; + if (strncmp("fifo", s, len) == 0) + xar->file->mode = + (xar->file->mode & ~AE_IFMT) | AE_IFIFO; + xar->file->has |= HAS_TYPE; + break; + case FILE_INODE: + xar->file->has |= HAS_INO; + xar->file->ino64 = atol10(s, len); + break; + case FILE_DEVICE_MAJOR: + xar->file->has |= HAS_DEVMAJOR; + xar->file->devmajor = (dev_t)atol10(s, len); + break; + case FILE_DEVICE_MINOR: + xar->file->has |= HAS_DEVMINOR; + xar->file->devminor = (dev_t)atol10(s, len); + break; + case FILE_DEVICENO: + xar->file->has |= HAS_DEV; + xar->file->dev = (dev_t)atol10(s, len); + break; + case FILE_MODE: + xar->file->has |= HAS_MODE; + xar->file->mode = + (xar->file->mode & AE_IFMT) | + (atol8(s, len) & ~AE_IFMT); + break; + case FILE_GROUP: + xar->file->has |= HAS_GID; + archive_strncpy(&(xar->file->gname), s, len); + break; + case FILE_GID: + xar->file->has |= HAS_GID; + xar->file->gid = atol10(s, len); + break; + case FILE_USER: + xar->file->has |= HAS_UID; + archive_strncpy(&(xar->file->uname), s, len); + break; + case FILE_UID: + xar->file->has |= HAS_UID; + xar->file->uid = atol10(s, len); + break; + case FILE_CTIME: + xar->file->has |= HAS_TIME; + xar->file->ctime = parse_time(s, len); + break; + case FILE_MTIME: + xar->file->has |= HAS_TIME; + xar->file->mtime = parse_time(s, len); + break; + case FILE_ATIME: + xar->file->has |= HAS_TIME; + xar->file->atime = parse_time(s, len); + break; + case FILE_DATA_LENGTH: + xar->file->has |= HAS_DATA; + xar->file->length = atol10(s, len); + break; + case FILE_DATA_OFFSET: + xar->file->has |= HAS_DATA; + xar->file->offset = atol10(s, len); + break; + case FILE_DATA_SIZE: + xar->file->has |= HAS_DATA; + xar->file->size = atol10(s, len); + break; + case FILE_DATA_A_CHECKSUM: + xar->file->a_sum.len = atohex(xar->file->a_sum.val, + sizeof(xar->file->a_sum.val), s, len); + break; + case FILE_DATA_E_CHECKSUM: + xar->file->e_sum.len = atohex(xar->file->e_sum.val, + sizeof(xar->file->e_sum.val), s, len); + break; + case FILE_EA_LENGTH: + 
xar->file->has |= HAS_XATTR; + xar->xattr->length = atol10(s, len); + break; + case FILE_EA_OFFSET: + xar->file->has |= HAS_XATTR; + xar->xattr->offset = atol10(s, len); + break; + case FILE_EA_SIZE: + xar->file->has |= HAS_XATTR; + xar->xattr->size = atol10(s, len); + break; + case FILE_EA_A_CHECKSUM: + xar->file->has |= HAS_XATTR; + xar->xattr->a_sum.len = atohex(xar->xattr->a_sum.val, + sizeof(xar->xattr->a_sum.val), s, len); + break; + case FILE_EA_E_CHECKSUM: + xar->file->has |= HAS_XATTR; + xar->xattr->e_sum.len = atohex(xar->xattr->e_sum.val, + sizeof(xar->xattr->e_sum.val), s, len); + break; + case FILE_EA_NAME: + xar->file->has |= HAS_XATTR; + archive_strncpy(&(xar->xattr->name), s, len); + break; + case FILE_EA_FSTYPE: + xar->file->has |= HAS_XATTR; + archive_strncpy(&(xar->xattr->fstype), s, len); + break; + break; + case FILE_ACL_DEFAULT: + case FILE_ACL_ACCESS: + case FILE_ACL_APPLEEXTENDED: + xar->file->has |= HAS_ACL; + /* TODO */ + break; + case INIT: + case XAR: + case TOC: + case TOC_CREATION_TIME: + case TOC_CHECKSUM: + case TOC_CHECKSUM_OFFSET: + case TOC_CHECKSUM_SIZE: + case TOC_FILE: + case FILE_DATA: + case FILE_DATA_ENCODING: + case FILE_DATA_CONTENT: + case FILE_DEVICE: + case FILE_EA: + case FILE_EA_ENCODING: + case FILE_ACL: + case FILE_FLAGS: + case FILE_FLAGS_USER_NODUMP: + case FILE_FLAGS_USER_IMMUTABLE: + case FILE_FLAGS_USER_APPEND: + case FILE_FLAGS_USER_OPAQUE: + case FILE_FLAGS_USER_NOUNLINK: + case FILE_FLAGS_SYS_ARCHIVED: + case FILE_FLAGS_SYS_IMMUTABLE: + case FILE_FLAGS_SYS_APPEND: + case FILE_FLAGS_SYS_NOUNLINK: + case FILE_FLAGS_SYS_SNAPSHOT: + case FILE_EXT2: + case FILE_EXT2_SecureDeletion: + case FILE_EXT2_Undelete: + case FILE_EXT2_Compress: + case FILE_EXT2_Synchronous: + case FILE_EXT2_Immutable: + case FILE_EXT2_AppendOnly: + case FILE_EXT2_NoDump: + case FILE_EXT2_NoAtime: + case FILE_EXT2_CompDirty: + case FILE_EXT2_CompBlock: + case FILE_EXT2_NoCompBlock: + case FILE_EXT2_CompError: + case FILE_EXT2_BTree: + case 
FILE_EXT2_HashIndexed: + case FILE_EXT2_iMagic: + case FILE_EXT2_Journaled: + case FILE_EXT2_NoTail: + case FILE_EXT2_DirSync: + case FILE_EXT2_TopDir: + case FILE_EXT2_Reserved: + case UNKNOWN: + break; + } +} + +/* + * BSD file flags. + */ +static int +xml_parse_file_flags(struct xar *xar, const char *name) +{ + const char *flag = NULL; + + if (strcmp(name, "UserNoDump") == 0) { + xar->xmlsts = FILE_FLAGS_USER_NODUMP; + flag = "nodump"; + } + else if (strcmp(name, "UserImmutable") == 0) { + xar->xmlsts = FILE_FLAGS_USER_IMMUTABLE; + flag = "uimmutable"; + } + else if (strcmp(name, "UserAppend") == 0) { + xar->xmlsts = FILE_FLAGS_USER_APPEND; + flag = "uappend"; + } + else if (strcmp(name, "UserOpaque") == 0) { + xar->xmlsts = FILE_FLAGS_USER_OPAQUE; + flag = "opaque"; + } + else if (strcmp(name, "UserNoUnlink") == 0) { + xar->xmlsts = FILE_FLAGS_USER_NOUNLINK; + flag = "nouunlink"; + } + else if (strcmp(name, "SystemArchived") == 0) { + xar->xmlsts = FILE_FLAGS_SYS_ARCHIVED; + flag = "archived"; + } + else if (strcmp(name, "SystemImmutable") == 0) { + xar->xmlsts = FILE_FLAGS_SYS_IMMUTABLE; + flag = "simmutable"; + } + else if (strcmp(name, "SystemAppend") == 0) { + xar->xmlsts = FILE_FLAGS_SYS_APPEND; + flag = "sappend"; + } + else if (strcmp(name, "SystemNoUnlink") == 0) { + xar->xmlsts = FILE_FLAGS_SYS_NOUNLINK; + flag = "nosunlink"; + } + else if (strcmp(name, "SystemSnapshot") == 0) { + xar->xmlsts = FILE_FLAGS_SYS_SNAPSHOT; + flag = "snapshot"; + } + + if (flag == NULL) + return (0); + xar->file->has |= HAS_FFLAGS; + if (archive_strlen(&(xar->file->fflags_text)) > 0) + archive_strappend_char(&(xar->file->fflags_text), ','); + archive_strcat(&(xar->file->fflags_text), flag); + return (1); +} + +/* + * Linux file flags. 
+ */ +static int +xml_parse_file_ext2(struct xar *xar, const char *name) +{ + const char *flag = NULL; + + if (strcmp(name, "SecureDeletion") == 0) { + xar->xmlsts = FILE_EXT2_SecureDeletion; + flag = "securedeletion"; + } + else if (strcmp(name, "Undelete") == 0) { + xar->xmlsts = FILE_EXT2_Undelete; + flag = "nouunlink"; + } + else if (strcmp(name, "Compress") == 0) { + xar->xmlsts = FILE_EXT2_Compress; + flag = "compress"; + } + else if (strcmp(name, "Synchronous") == 0) { + xar->xmlsts = FILE_EXT2_Synchronous; + flag = "sync"; + } + else if (strcmp(name, "Immutable") == 0) { + xar->xmlsts = FILE_EXT2_Immutable; + flag = "simmutable"; + } + else if (strcmp(name, "AppendOnly") == 0) { + xar->xmlsts = FILE_EXT2_AppendOnly; + flag = "sappend"; + } + else if (strcmp(name, "NoDump") == 0) { + xar->xmlsts = FILE_EXT2_NoDump; + flag = "nodump"; + } + else if (strcmp(name, "NoAtime") == 0) { + xar->xmlsts = FILE_EXT2_NoAtime; + flag = "noatime"; + } + else if (strcmp(name, "CompDirty") == 0) { + xar->xmlsts = FILE_EXT2_CompDirty; + flag = "compdirty"; + } + else if (strcmp(name, "CompBlock") == 0) { + xar->xmlsts = FILE_EXT2_CompBlock; + flag = "comprblk"; + } + else if (strcmp(name, "NoCompBlock") == 0) { + xar->xmlsts = FILE_EXT2_NoCompBlock; + flag = "nocomprblk"; + } + else if (strcmp(name, "CompError") == 0) { + xar->xmlsts = FILE_EXT2_CompError; + flag = "comperr"; + } + else if (strcmp(name, "BTree") == 0) { + xar->xmlsts = FILE_EXT2_BTree; + flag = "btree"; + } + else if (strcmp(name, "HashIndexed") == 0) { + xar->xmlsts = FILE_EXT2_HashIndexed; + flag = "hashidx"; + } + else if (strcmp(name, "iMagic") == 0) { + xar->xmlsts = FILE_EXT2_iMagic; + flag = "imagic"; + } + else if (strcmp(name, "Journaled") == 0) { + xar->xmlsts = FILE_EXT2_Journaled; + flag = "journal"; + } + else if (strcmp(name, "NoTail") == 0) { + xar->xmlsts = FILE_EXT2_NoTail; + flag = "notail"; + } + else if (strcmp(name, "DirSync") == 0) { + xar->xmlsts = FILE_EXT2_DirSync; + flag = 
"dirsync"; + } + else if (strcmp(name, "TopDir") == 0) { + xar->xmlsts = FILE_EXT2_TopDir; + flag = "topdir"; + } + else if (strcmp(name, "Reserved") == 0) { + xar->xmlsts = FILE_EXT2_Reserved; + flag = "reserved"; + } + + if (flag == NULL) + return (0); + if (archive_strlen(&(xar->file->fflags_text)) > 0) + archive_strappend_char(&(xar->file->fflags_text), ','); + archive_strcat(&(xar->file->fflags_text), flag); + return (1); +} + +#ifdef HAVE_LIBXML_XMLREADER_H + +static int +xml2_xmlattr_setup(struct xmlattr_list *list, xmlTextReaderPtr reader) +{ + struct xmlattr *attr; + int r; + + list->first = NULL; + list->last = &(list->first); + r = xmlTextReaderMoveToFirstAttribute(reader); + while (r == 1) { + attr = malloc(sizeof*(attr)); + if (attr == NULL) + __archive_errx(1, "Out of memory"); + attr->name = strdup( + (const char *)xmlTextReaderConstLocalName(reader)); + if (attr->name == NULL) + __archive_errx(1, "Out of memory"); + attr->value = strdup( + (const char *)xmlTextReaderConstValue(reader)); + if (attr->value == NULL) + __archive_errx(1, "Out of memory"); + attr->next = NULL; + *list->last = attr; + list->last = &(attr->next); + r = xmlTextReaderMoveToNextAttribute(reader); + } + return (r); +} + +static int +xml2_read_cb(void *context, char *buffer, int len) +{ + struct archive_read *a; + struct xar *xar; + const void *d; + size_t outbytes; + size_t used; + int r; + + a = (struct archive_read *)context; + xar = (struct xar *)(a->format->data); + + if (xar->toc_remaining <= 0) + return (0); + d = buffer; + outbytes = len; + r = rd_contents(a, &d, &outbytes, &used, xar->toc_remaining); + if (r != ARCHIVE_OK) + return (r); + __archive_read_consume(a, used); + xar->toc_remaining -= used; + xar->offset += used; + xar->toc_total += outbytes; + PRINT_TOC(buffer, len); + + return ((int)outbytes); +} + +static int +xml2_close_cb(void *context) +{ + + (void)context; /* UNUSED */ + return (0); +} + +static void +xml2_error_hdr(void *arg, const char *msg, 
xmlParserSeverities severity, + xmlTextReaderLocatorPtr locator) +{ + struct archive_read *a; + + (void)locator; /* UNUSED */ + a = (struct archive_read *)arg; + switch (severity) { + case XML_PARSER_SEVERITY_VALIDITY_WARNING: + case XML_PARSER_SEVERITY_WARNING: + archive_set_error(&a->archive, ARCHIVE_ERRNO_MISC, + "XML Parsing error: %s", msg); + break; + case XML_PARSER_SEVERITY_VALIDITY_ERROR: + case XML_PARSER_SEVERITY_ERROR: + archive_set_error(&a->archive, ARCHIVE_ERRNO_MISC, + "XML Parsing error: %s", msg); + break; + } +} + +static int +xml2_read_toc(struct archive_read *a) +{ + xmlTextReaderPtr reader; + struct xmlattr_list list; + int r; + + reader = xmlReaderForIO(xml2_read_cb, xml2_close_cb, a, NULL, NULL, 0); + if (reader == NULL) { + archive_set_error(&a->archive, ENOMEM, + "Couldn't allocate memory for xml parser"); + return (ARCHIVE_FATAL); + } + xmlTextReaderSetErrorHandler(reader, xml2_error_hdr, a); + + while ((r = xmlTextReaderRead(reader)) == 1) { + const char *name, *value; + int type, empty; + + type = xmlTextReaderNodeType(reader); + name = (const char *)xmlTextReaderConstLocalName(reader); + switch (type) { + case XML_READER_TYPE_ELEMENT: + empty = xmlTextReaderIsEmptyElement(reader); + r = xml2_xmlattr_setup(&list, reader); + if (r == 0) { + xml_start(a, name, &list); + xmlattr_cleanup(&list); + if (empty) + xml_end(a, name); + } + break; + case XML_READER_TYPE_END_ELEMENT: + xml_end(a, name); + break; + case XML_READER_TYPE_TEXT: + value = (const char *)xmlTextReaderConstValue(reader); + xml_data(a, value, strlen(value)); + break; + case XML_READER_TYPE_SIGNIFICANT_WHITESPACE: + default: + break; + } + if (r < 0) + break; + } + xmlFreeTextReader(reader); + xmlCleanupParser(); + + return ((r == 0)?ARCHIVE_OK:ARCHIVE_FATAL); +} + +#elif defined(HAVE_BSDXML_H) || defined(HAVE_EXPAT_H) + +static void +expat_xmlattr_setup(struct xmlattr_list *list, const XML_Char **atts) +{ + struct xmlattr *attr; + + list->first = NULL; + list->last = 
&(list->first); + if (atts == NULL) + return; + while (atts[0] != NULL && atts[1] != NULL) { + attr = malloc(sizeof*(attr)); + if (attr == NULL) + __archive_errx(1, "Out of memory"); + attr->name = strdup(atts[0]); + if (attr->name == NULL) + __archive_errx(1, "Out of memory"); + attr->value = strdup(atts[1]); + if (attr->value == NULL) + __archive_errx(1, "Out of memory"); + attr->next = NULL; + *list->last = attr; + list->last = &(attr->next); + atts += 2; + } +} + +static void +expat_start_cb(void *userData, const XML_Char *name, const XML_Char **atts) +{ + struct xmlattr_list list; + + expat_xmlattr_setup(&list, atts); + xml_start(userData, (const char *)name, &list); + xmlattr_cleanup(&list); +} + +static void +expat_end_cb(void *userData, const XML_Char *name) +{ + xml_end(userData, (const char *)name); +} + +static void +expat_data_cb(void *userData, const XML_Char *s, int len) +{ + xml_data(userData, s, len); +} + +static int +expat_read_toc(struct archive_read *a) +{ + struct xar *xar; + XML_Parser parser; + + xar = (struct xar *)(a->format->data); + + /* Initialize XML Parser library. 
*/ + parser = XML_ParserCreate(NULL); + if (parser == NULL) { + archive_set_error(&a->archive, ENOMEM, + "Couldn't allocate memory for xml parser"); + return (ARCHIVE_FATAL); + } + XML_SetUserData(parser, a); + XML_SetElementHandler(parser, expat_start_cb, expat_end_cb); + XML_SetCharacterDataHandler(parser, expat_data_cb); + xar->xmlsts = INIT; + + while (xar->toc_remaining) { + enum XML_Status xr; + const void *d; + size_t outbytes; + size_t used; + int r; + + d = NULL; + r = rd_contents(a, &d, &outbytes, &used, xar->toc_remaining); + if (r != ARCHIVE_OK) + return (r); + __archive_read_consume(a, used); + xar->toc_remaining -= used; + xar->offset += used; + xar->toc_total += outbytes; + PRINT_TOC(d, outbytes); + + xr = XML_Parse(parser, d, outbytes, xar->toc_remaining == 0); + if (xr == XML_STATUS_ERROR) { + XML_ParserFree(parser); + archive_set_error(&a->archive, ARCHIVE_ERRNO_MISC, + "XML Parsing failed"); + return (ARCHIVE_FATAL); + } + } + XML_ParserFree(parser); + return (ARCHIVE_OK); +} +#endif /* defined(HAVE_BSDXML_H) || defined(HAVE_EXPAT_H) */ + +#endif /* Support xar format */ diff --git a/lib/libarchive/archive_read_support_format_zip.c b/lib/libarchive/archive_read_support_format_zip.c new file mode 100644 index 000000000..8cf7f0dcf --- /dev/null +++ b/lib/libarchive/archive_read_support_format_zip.c @@ -0,0 +1,926 @@ +/*- + * Copyright (c) 2004 Tim Kientzle + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. 
+ * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR(S) ``AS IS'' AND ANY EXPRESS OR + * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES + * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. + * IN NO EVENT SHALL THE AUTHOR(S) BE LIABLE FOR ANY DIRECT, INDIRECT, + * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT + * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF + * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + */ + +#include "archive_platform.h" +__FBSDID("$FreeBSD: head/lib/libarchive/archive_read_support_format_zip.c 201102 2009-12-28 03:11:36Z kientzle $"); + +#ifdef HAVE_ERRNO_H +#include <errno.h> +#endif +#include <stdio.h> +#ifdef HAVE_STDLIB_H +#include <stdlib.h> +#endif +#include <time.h> +#ifdef HAVE_ZLIB_H +#include <zlib.h> +#endif + +#include "archive.h" +#include "archive_entry.h" +#include "archive_private.h" +#include "archive_read_private.h" +#include "archive_endian.h" + +#ifndef HAVE_ZLIB_H +#include "archive_crc32.h" +#endif + +struct zip { + /* entry_bytes_remaining is the number of bytes we expect. */ +#ifndef __minix + int64_t entry_bytes_remaining; + int64_t entry_offset; +#else + size_t entry_bytes_remaining; + off_t entry_offset; +#endif + /* These count the number of bytes actually read for the entry. 
*/ +#ifndef __minix + int64_t entry_compressed_bytes_read; + int64_t entry_uncompressed_bytes_read; +#else + size_t entry_compressed_bytes_read; + size_t entry_uncompressed_bytes_read; +#endif + /* Running CRC32 of the decompressed data */ + unsigned long entry_crc32; + + unsigned version; + unsigned system; + unsigned flags; + unsigned compression; + const char * compression_name; + time_t mtime; + time_t ctime; + time_t atime; + mode_t mode; + uid_t uid; + gid_t gid; + + /* Flags to mark progress of decompression. */ + char decompress_init; + char end_of_entry; + + unsigned long crc32; + ssize_t filename_length; + ssize_t extra_length; +#ifndef __minix + int64_t uncompressed_size; + int64_t compressed_size; +#else + size_t uncompressed_size; + size_t compressed_size; +#endif + unsigned char *uncompressed_buffer; + size_t uncompressed_buffer_size; +#ifdef HAVE_ZLIB_H + z_stream stream; + char stream_valid; +#endif + + struct archive_string pathname; + struct archive_string extra; + char format_name[64]; +}; + +#define ZIP_LENGTH_AT_END 8 + +struct zip_file_header { + char signature[4]; + char version[2]; + char flags[2]; + char compression[2]; + char timedate[4]; + char crc32[4]; + char compressed_size[4]; + char uncompressed_size[4]; + char filename_length[2]; + char extra_length[2]; +}; + +static const char *compression_names[] = { + "uncompressed", + "shrinking", + "reduced-1", + "reduced-2", + "reduced-3", + "reduced-4", + "imploded", + "reserved", + "deflation" +}; + +static int archive_read_format_zip_bid(struct archive_read *); +static int archive_read_format_zip_cleanup(struct archive_read *); +static int archive_read_format_zip_read_data(struct archive_read *, + const void **, size_t *, off_t *); +static int archive_read_format_zip_read_data_skip(struct archive_read *a); +static int archive_read_format_zip_read_header(struct archive_read *, + struct archive_entry *); +static int zip_read_data_deflate(struct archive_read *a, const void **buff, + size_t 
*size, off_t *offset); +static int zip_read_data_none(struct archive_read *a, const void **buff, + size_t *size, off_t *offset); +static int zip_read_file_header(struct archive_read *a, + struct archive_entry *entry, struct zip *zip); +static time_t zip_time(const char *); +static void process_extra(const void* extra, struct zip* zip); + +int +archive_read_support_format_zip(struct archive *_a) +{ + struct archive_read *a = (struct archive_read *)_a; + struct zip *zip; + int r; + + zip = (struct zip *)malloc(sizeof(*zip)); + if (zip == NULL) { + archive_set_error(&a->archive, ENOMEM, "Can't allocate zip data"); + return (ARCHIVE_FATAL); + } + memset(zip, 0, sizeof(*zip)); + + r = __archive_read_register_format(a, + zip, + "zip", + archive_read_format_zip_bid, + NULL, + archive_read_format_zip_read_header, + archive_read_format_zip_read_data, + archive_read_format_zip_read_data_skip, + archive_read_format_zip_cleanup); + + if (r != ARCHIVE_OK) + free(zip); + return (ARCHIVE_OK); +} + + +static int +archive_read_format_zip_bid(struct archive_read *a) +{ + const char *p; + const void *buff; + ssize_t bytes_avail, offset; + + if ((p = __archive_read_ahead(a, 4, NULL)) == NULL) + return (-1); + + /* + * Bid of 30 here is: 16 bits for "PK", + * next 16-bit field has four options (-2 bits). + * 16 + 16-2 = 30. + */ + if (p[0] == 'P' && p[1] == 'K') { + if ((p[2] == '\001' && p[3] == '\002') + || (p[2] == '\003' && p[3] == '\004') + || (p[2] == '\005' && p[3] == '\006') + || (p[2] == '\007' && p[3] == '\010') + || (p[2] == '0' && p[3] == '0')) + return (30); + } + + /* + * Attempt to handle self-extracting archives + * by noting a PE header and searching forward + * up to 128k for a 'PK\003\004' marker. + */ + if (p[0] == 'M' && p[1] == 'Z') { + /* + * TODO: Optimize by initializing 'offset' to an + * estimate of the likely start of the archive data + * based on values in the PE header. Note that we + * don't need to be exact, but we mustn't skip too + * far. 
The search below will compensate if we + * undershoot. + */ + offset = 0; + while (offset < 124000) { + /* Get 4k of data beyond where we stopped. */ + buff = __archive_read_ahead(a, offset + 4096, + &bytes_avail); + if (buff == NULL) + break; + p = (const char *)buff + offset; + while (p + 9 < (const char *)buff + bytes_avail) { + if (p[0] == 'P' && p[1] == 'K' /* signature */ + && p[2] == 3 && p[3] == 4 /* File entry */ + && p[8] == 8 /* compression == deflate */ + && p[9] == 0 /* High byte of compression */ + ) + { + return (30); + } + ++p; + } + offset = p - (const char *)buff; + } + } + + return (0); +} + +/* + * Search forward for a "PK\003\004" file header. This handles the + * case of self-extracting archives, where there is an executable + * prepended to the ZIP archive. + */ +static int +skip_sfx(struct archive_read *a) +{ + const void *h; + const char *p, *q; + size_t skip; + ssize_t bytes; + + /* + * TODO: We should be able to skip forward by a bunch + * by lifting some values from the PE header. We don't + * need to be exact (we're still going to search forward + * to find the header), but it will speed things up and + * reduce the chance of a false positive. + */ + for (;;) { + h = __archive_read_ahead(a, 4, &bytes); + if (bytes < 4) + return (ARCHIVE_FATAL); + p = h; + q = p + bytes; + + /* + * Scan ahead until we find something that looks + * like the zip header. + */ + while (p + 4 < q) { + switch (p[3]) { + case '\004': + /* TODO: Additional verification here. 
*/ + if (memcmp("PK\003\004", p, 4) == 0) { + skip = p - (const char *)h; + __archive_read_consume(a, skip); + return (ARCHIVE_OK); + } + p += 4; + break; + case '\003': p += 1; break; + case 'K': p += 2; break; + case 'P': p += 3; break; + default: p += 4; break; + } + } + skip = p - (const char *)h; + __archive_read_consume(a, skip); + } +} + +static int +archive_read_format_zip_read_header(struct archive_read *a, + struct archive_entry *entry) +{ + const void *h; + const char *signature; + struct zip *zip; + int r = ARCHIVE_OK, r1; + + a->archive.archive_format = ARCHIVE_FORMAT_ZIP; + if (a->archive.archive_format_name == NULL) + a->archive.archive_format_name = "ZIP"; + + zip = (struct zip *)(a->format->data); + zip->decompress_init = 0; + zip->end_of_entry = 0; + zip->entry_uncompressed_bytes_read = 0; + zip->entry_compressed_bytes_read = 0; + zip->entry_crc32 = crc32(0, NULL, 0); + if ((h = __archive_read_ahead(a, 4, NULL)) == NULL) + return (ARCHIVE_FATAL); + + signature = (const char *)h; + if (signature[0] == 'M' && signature[1] == 'Z') { + /* This is an executable? Must be self-extracting... */ + r = skip_sfx(a); + if (r < ARCHIVE_WARN) + return (r); + if ((h = __archive_read_ahead(a, 4, NULL)) == NULL) + return (ARCHIVE_FATAL); + signature = (const char *)h; + } + + if (signature[0] != 'P' || signature[1] != 'K') { + archive_set_error(&a->archive, ARCHIVE_ERRNO_FILE_FORMAT, + "Bad ZIP file"); + return (ARCHIVE_FATAL); + } + + /* + * "PK00" signature is used for "split" archives that + * only have a single segment. This means we can just + * skip the PK00; the first real file header should follow. 
+ */ + if (signature[2] == '0' && signature[3] == '0') { + __archive_read_consume(a, 4); + if ((h = __archive_read_ahead(a, 4, NULL)) == NULL) + return (ARCHIVE_FATAL); + signature = (const char *)h; + if (signature[0] != 'P' || signature[1] != 'K') { + archive_set_error(&a->archive, ARCHIVE_ERRNO_FILE_FORMAT, + "Bad ZIP file"); + return (ARCHIVE_FATAL); + } + } + + if (signature[2] == '\001' && signature[3] == '\002') { + /* Beginning of central directory. */ + return (ARCHIVE_EOF); + } + + if (signature[2] == '\003' && signature[3] == '\004') { + /* Regular file entry. */ + r1 = zip_read_file_header(a, entry, zip); + if (r1 != ARCHIVE_OK) + return (r1); + return (r); + } + + if (signature[2] == '\005' && signature[3] == '\006') { + /* End-of-archive record. */ + return (ARCHIVE_EOF); + } + + if (signature[2] == '\007' && signature[3] == '\010') { + /* + * We should never encounter this record here; + * see ZIP_LENGTH_AT_END handling below for details. + */ + archive_set_error(&a->archive, ARCHIVE_ERRNO_MISC, + "Bad ZIP file: Unexpected end-of-entry record"); + return (ARCHIVE_FATAL); + } + + archive_set_error(&a->archive, ARCHIVE_ERRNO_FILE_FORMAT, + "Damaged ZIP file or unsupported format variant (%d,%d)", + signature[2], signature[3]); + return (ARCHIVE_FATAL); +} + +static int +zip_read_file_header(struct archive_read *a, struct archive_entry *entry, + struct zip *zip) +{ + const struct zip_file_header *p; + const void *h; + + if ((p = __archive_read_ahead(a, sizeof *p, NULL)) == NULL) { + archive_set_error(&a->archive, ARCHIVE_ERRNO_FILE_FORMAT, + "Truncated ZIP file header"); + return (ARCHIVE_FATAL); + } + + zip->version = p->version[0]; + zip->system = p->version[1]; + zip->flags = archive_le16dec(p->flags); + zip->compression = archive_le16dec(p->compression); + if (zip->compression < + sizeof(compression_names)/sizeof(compression_names[0])) + zip->compression_name = compression_names[zip->compression]; + else + zip->compression_name = "??"; + zip->mtime 
= zip_time(p->timedate); + zip->ctime = 0; + zip->atime = 0; + zip->mode = 0; + zip->uid = 0; + zip->gid = 0; + zip->crc32 = archive_le32dec(p->crc32); + zip->filename_length = archive_le16dec(p->filename_length); + zip->extra_length = archive_le16dec(p->extra_length); + zip->uncompressed_size = archive_le32dec(p->uncompressed_size); + zip->compressed_size = archive_le32dec(p->compressed_size); + + __archive_read_consume(a, sizeof(struct zip_file_header)); + + + /* Read the filename. */ + if ((h = __archive_read_ahead(a, zip->filename_length, NULL)) == NULL) { + archive_set_error(&a->archive, ARCHIVE_ERRNO_FILE_FORMAT, + "Truncated ZIP file header"); + return (ARCHIVE_FATAL); + } + if (archive_string_ensure(&zip->pathname, zip->filename_length) == NULL) + __archive_errx(1, "Out of memory"); + archive_strncpy(&zip->pathname, h, zip->filename_length); + __archive_read_consume(a, zip->filename_length); + archive_entry_set_pathname(entry, zip->pathname.s); + + /* Guard against an empty pathname before inspecting its last byte. */ + if (archive_strlen(&zip->pathname) > 0 && zip->pathname.s[archive_strlen(&zip->pathname) - 1] == '/') + zip->mode = AE_IFDIR | 0777; + else + zip->mode = AE_IFREG | 0777; + + /* Read the extra data. */ + if ((h = __archive_read_ahead(a, zip->extra_length, NULL)) == NULL) { + archive_set_error(&a->archive, ARCHIVE_ERRNO_FILE_FORMAT, + "Truncated ZIP file header"); + return (ARCHIVE_FATAL); + } + process_extra(h, zip); + __archive_read_consume(a, zip->extra_length); + + /* Populate some additional entry fields: */ + archive_entry_set_mode(entry, zip->mode); + archive_entry_set_uid(entry, zip->uid); + archive_entry_set_gid(entry, zip->gid); + archive_entry_set_mtime(entry, zip->mtime, 0); + archive_entry_set_ctime(entry, zip->ctime, 0); + archive_entry_set_atime(entry, zip->atime, 0); + /* Set the size only if it's meaningful. 
*/ + if (0 == (zip->flags & ZIP_LENGTH_AT_END)) + archive_entry_set_size(entry, zip->uncompressed_size); + + zip->entry_bytes_remaining = zip->compressed_size; + zip->entry_offset = 0; + + /* If there's no body, force read_data() to return EOF immediately. */ + if (0 == (zip->flags & ZIP_LENGTH_AT_END) + && zip->entry_bytes_remaining < 1) + zip->end_of_entry = 1; + + /* Set up a more descriptive format name. */ + sprintf(zip->format_name, "ZIP %d.%d (%s)", + zip->version / 10, zip->version % 10, + zip->compression_name); + a->archive.archive_format_name = zip->format_name; + + return (ARCHIVE_OK); +} + +/* Convert an MSDOS-style date/time into Unix-style time. */ +static time_t +zip_time(const char *p) +{ + int msTime, msDate; + struct tm ts; + + msTime = (0xff & (unsigned)p[0]) + 256 * (0xff & (unsigned)p[1]); + msDate = (0xff & (unsigned)p[2]) + 256 * (0xff & (unsigned)p[3]); + + memset(&ts, 0, sizeof(ts)); + ts.tm_year = ((msDate >> 9) & 0x7f) + 80; /* Years since 1900. */ + ts.tm_mon = ((msDate >> 5) & 0x0f) - 1; /* Month number. */ + ts.tm_mday = msDate & 0x1f; /* Day of month. */ + ts.tm_hour = (msTime >> 11) & 0x1f; + ts.tm_min = (msTime >> 5) & 0x3f; + ts.tm_sec = (msTime << 1) & 0x3e; + ts.tm_isdst = -1; + return mktime(&ts); +} + +static int +archive_read_format_zip_read_data(struct archive_read *a, + const void **buff, size_t *size, off_t *offset) +{ + int r; + struct zip *zip; + + zip = (struct zip *)(a->format->data); + + /* + * If we hit end-of-entry last time, clean up and return + * ARCHIVE_EOF this time. + */ + if (zip->end_of_entry) { + *offset = zip->entry_uncompressed_bytes_read; + *size = 0; + *buff = NULL; + return (ARCHIVE_EOF); + } + + switch(zip->compression) { + case 0: /* No compression. */ + r = zip_read_data_none(a, buff, size, offset); + break; + case 8: /* Deflate compression. */ + r = zip_read_data_deflate(a, buff, size, offset); + break; + default: /* Unsupported compression. 
*/ + *buff = NULL; + *size = 0; + *offset = 0; + /* Return a warning. */ + archive_set_error(&a->archive, ARCHIVE_ERRNO_FILE_FORMAT, + "Unsupported ZIP compression method (%s)", + zip->compression_name); + if (zip->flags & ZIP_LENGTH_AT_END) { + /* + * ZIP_LENGTH_AT_END requires us to + * decompress the entry in order to + * skip it, but we don't know this + * compression method, so we give up. + */ + r = ARCHIVE_FATAL; + } else { + /* We can't decompress this entry, but we will + * be able to skip() it and try the next entry. */ + r = ARCHIVE_WARN; + } + break; + } + if (r != ARCHIVE_OK) + return (r); + /* Update checksum */ + if (*size) + zip->entry_crc32 = crc32(zip->entry_crc32, *buff, *size); + /* If we hit the end, swallow any end-of-data marker. */ + if (zip->end_of_entry) { + if (zip->flags & ZIP_LENGTH_AT_END) { + const char *p; + + if ((p = __archive_read_ahead(a, 16, NULL)) == NULL) { + archive_set_error(&a->archive, + ARCHIVE_ERRNO_FILE_FORMAT, + "Truncated ZIP end-of-file record"); + return (ARCHIVE_FATAL); + } + zip->crc32 = archive_le32dec(p + 4); + zip->compressed_size = archive_le32dec(p + 8); + zip->uncompressed_size = archive_le32dec(p + 12); + __archive_read_consume(a, 16); + } + /* Check file size, CRC against these values. */ + if (zip->compressed_size != zip->entry_compressed_bytes_read) { + archive_set_error(&a->archive, ARCHIVE_ERRNO_MISC, + "ZIP compressed data is wrong size"); + return (ARCHIVE_WARN); + } + /* Size field only stores the lower 32 bits of the actual size. 
*/ + if ((zip->uncompressed_size & UINT32_MAX) + != (zip->entry_uncompressed_bytes_read & UINT32_MAX)) { + archive_set_error(&a->archive, ARCHIVE_ERRNO_MISC, + "ZIP uncompressed data is wrong size"); + return (ARCHIVE_WARN); + } + /* Check computed CRC against header */ + if (zip->crc32 != zip->entry_crc32) { + archive_set_error(&a->archive, ARCHIVE_ERRNO_MISC, + "ZIP bad CRC: 0x%lx should be 0x%lx", + zip->entry_crc32, zip->crc32); + return (ARCHIVE_WARN); + } + } + + /* Return EOF immediately if this is a non-regular file. */ + if (AE_IFREG != (zip->mode & AE_IFMT)) + return (ARCHIVE_EOF); + return (ARCHIVE_OK); +} + +/* + * Read "uncompressed" data. According to the current specification, + * if ZIP_LENGTH_AT_END is specified, then the size fields in the + * initial file header are supposed to be set to zero. This would, of + * course, make it impossible for us to read the archive, since we + * couldn't determine the end of the file data. Info-ZIP seems to + * include the real size fields both before and after the data in this + * case (the CRC only appears afterwards), so this works as you would + * expect. + * + * Returns ARCHIVE_OK if successful, ARCHIVE_FATAL otherwise, sets + * zip->end_of_entry if it consumes all of the data. + */ +static int +zip_read_data_none(struct archive_read *a, const void **buff, + size_t *size, off_t *offset) +{ + struct zip *zip; + ssize_t bytes_avail; + + zip = (struct zip *)(a->format->data); + + if (zip->entry_bytes_remaining == 0) { + *buff = NULL; + *size = 0; + *offset = zip->entry_offset; + zip->end_of_entry = 1; + return (ARCHIVE_OK); + } + /* + * Note: '1' here is a performance optimization. + * Recall that the decompression layer returns a count of + * available bytes; asking for more than that forces the + * decompressor to combine reads by copying data. 
+ */ + *buff = __archive_read_ahead(a, 1, &bytes_avail); + if (bytes_avail <= 0) { + archive_set_error(&a->archive, ARCHIVE_ERRNO_FILE_FORMAT, + "Truncated ZIP file data"); + return (ARCHIVE_FATAL); + } + if (bytes_avail > zip->entry_bytes_remaining) + bytes_avail = zip->entry_bytes_remaining; + __archive_read_consume(a, bytes_avail); + *size = bytes_avail; + *offset = zip->entry_offset; + zip->entry_offset += *size; + zip->entry_bytes_remaining -= *size; + zip->entry_uncompressed_bytes_read += *size; + zip->entry_compressed_bytes_read += *size; + return (ARCHIVE_OK); +} + +#ifdef HAVE_ZLIB_H +static int +zip_read_data_deflate(struct archive_read *a, const void **buff, + size_t *size, off_t *offset) +{ + struct zip *zip; + ssize_t bytes_avail; + const void *compressed_buff; + int r; + + zip = (struct zip *)(a->format->data); + + /* If the buffer hasn't been allocated, allocate it now. */ + if (zip->uncompressed_buffer == NULL) { + zip->uncompressed_buffer_size = 32 * 1024; + zip->uncompressed_buffer + = (unsigned char *)malloc(zip->uncompressed_buffer_size); + if (zip->uncompressed_buffer == NULL) { + archive_set_error(&a->archive, ENOMEM, + "No memory for ZIP decompression"); + return (ARCHIVE_FATAL); + } + } + + /* If we haven't yet read any data, initialize the decompressor. */ + if (!zip->decompress_init) { + if (zip->stream_valid) + r = inflateReset(&zip->stream); + else + r = inflateInit2(&zip->stream, + -15 /* Don't check for zlib header */); + if (r != Z_OK) { + archive_set_error(&a->archive, ARCHIVE_ERRNO_MISC, + "Can't initialize ZIP decompression."); + return (ARCHIVE_FATAL); + } + /* Stream structure has been set up. */ + zip->stream_valid = 1; + /* We've initialized decompression for this stream. */ + zip->decompress_init = 1; + } + + /* + * Note: '1' here is a performance optimization. + * Recall that the decompression layer returns a count of + * available bytes; asking for more than that forces the + * decompressor to combine reads by copying data. 
+ */ + compressed_buff = __archive_read_ahead(a, 1, &bytes_avail); + if (bytes_avail <= 0) { + archive_set_error(&a->archive, ARCHIVE_ERRNO_FILE_FORMAT, + "Truncated ZIP file body"); + return (ARCHIVE_FATAL); + } + + /* + * A bug in zlib.h: stream.next_in should be marked 'const' + * but isn't (the library never alters data through the + * next_in pointer, only reads it). The result: this ugly + * cast to remove 'const'. + */ + zip->stream.next_in = (Bytef *)(uintptr_t)(const void *)compressed_buff; + zip->stream.avail_in = bytes_avail; + zip->stream.total_in = 0; + zip->stream.next_out = zip->uncompressed_buffer; + zip->stream.avail_out = zip->uncompressed_buffer_size; + zip->stream.total_out = 0; + + r = inflate(&zip->stream, 0); + switch (r) { + case Z_OK: + break; + case Z_STREAM_END: + zip->end_of_entry = 1; + break; + case Z_MEM_ERROR: + archive_set_error(&a->archive, ENOMEM, + "Out of memory for ZIP decompression"); + return (ARCHIVE_FATAL); + default: + archive_set_error(&a->archive, ARCHIVE_ERRNO_MISC, + "ZIP decompression failed (%d)", r); + return (ARCHIVE_FATAL); + } + + /* Consume as much as the compressor actually used. 
*/ + bytes_avail = zip->stream.total_in; + __archive_read_consume(a, bytes_avail); + zip->entry_bytes_remaining -= bytes_avail; + zip->entry_compressed_bytes_read += bytes_avail; + + *offset = zip->entry_offset; + *size = zip->stream.total_out; + zip->entry_uncompressed_bytes_read += *size; + *buff = zip->uncompressed_buffer; + zip->entry_offset += *size; + return (ARCHIVE_OK); +} +#else +static int +zip_read_data_deflate(struct archive_read *a, const void **buff, + size_t *size, off_t *offset) +{ + *buff = NULL; + *size = 0; + *offset = 0; + archive_set_error(&a->archive, ARCHIVE_ERRNO_MISC, + "libarchive compiled without deflate support (no libz)"); + return (ARCHIVE_FATAL); +} +#endif + +static int +archive_read_format_zip_read_data_skip(struct archive_read *a) +{ + struct zip *zip; + const void *buff = NULL; + off_t bytes_skipped; + + zip = (struct zip *)(a->format->data); + + /* If we've already read to end of data, we're done. */ + if (zip->end_of_entry) + return (ARCHIVE_OK); + + /* + * If the length is at the end, we have no choice but + * to decompress all the data to find the end marker. + */ + if (zip->flags & ZIP_LENGTH_AT_END) { + size_t size; + off_t offset; + int r; + do { + r = archive_read_format_zip_read_data(a, &buff, + &size, &offset); + } while (r == ARCHIVE_OK); + return (r); + } + + /* + * If the length is at the beginning, we can skip the + * compressed data much more quickly. + */ + bytes_skipped = __archive_read_skip(a, zip->entry_bytes_remaining); + if (bytes_skipped < 0) + return (ARCHIVE_FATAL); + + /* This entry is finished and done. 
*/ + zip->end_of_entry = 1; + return (ARCHIVE_OK); +} + +static int +archive_read_format_zip_cleanup(struct archive_read *a) +{ + struct zip *zip; + + zip = (struct zip *)(a->format->data); +#ifdef HAVE_ZLIB_H + if (zip->stream_valid) + inflateEnd(&zip->stream); +#endif + free(zip->uncompressed_buffer); + archive_string_free(&(zip->pathname)); + archive_string_free(&(zip->extra)); + free(zip); + (a->format->data) = NULL; + return (ARCHIVE_OK); +} + +/* + * The extra data is stored as a list of + * id1+size1+data1 + id2+size2+data2 ... + * triplets. id and size are 2 bytes each. + */ +static void +process_extra(const void* extra, struct zip* zip) +{ + int offset = 0; + const char *p = (const char *)extra; + while (offset < zip->extra_length - 4) + { + unsigned short headerid = archive_le16dec(p + offset); + unsigned short datasize = archive_le16dec(p + offset + 2); + offset += 4; + if (offset + datasize > zip->extra_length) + break; +#ifdef DEBUG + fprintf(stderr, "Header id 0x%04x, length %d\n", + headerid, datasize); +#endif + switch (headerid) { + case 0x0001: + /* Zip64 extended information extra field. */ +#ifndef __minix + if (datasize >= 8) + zip->uncompressed_size = archive_le64dec(p + offset); + if (datasize >= 16) + zip->compressed_size = archive_le64dec(p + offset + 8); + break; +#else + /* Minix file system does not support sizes that require 64 bit + * support so we can safely down cast this + */ + if (datasize >= 8) + zip->uncompressed_size = cv64ul(archive_le64dec(p + offset)); + if (datasize >= 16) + zip->compressed_size = cv64ul(archive_le64dec(p + offset + 8)); + break; +#endif + case 0x5455: + { + /* Extended time field "UT". */ + int flags = p[offset]; + offset++; + datasize--; + /* Flag bits indicate which dates are present. 
*/ + if (flags & 0x01) + { +#ifdef DEBUG + fprintf(stderr, "mtime: %lld -> %d\n", + (long long)zip->mtime, + archive_le32dec(p + offset)); +#endif + if (datasize < 4) + break; + zip->mtime = archive_le32dec(p + offset); + offset += 4; + datasize -= 4; + } + if (flags & 0x02) + { + if (datasize < 4) + break; + zip->atime = archive_le32dec(p + offset); + offset += 4; + datasize -= 4; + } + if (flags & 0x04) + { + if (datasize < 4) + break; + zip->ctime = archive_le32dec(p + offset); + offset += 4; + datasize -= 4; + } + break; + } + case 0x7855: + /* Info-ZIP Unix Extra Field (type 2) "Ux". */ +#ifdef DEBUG + fprintf(stderr, "uid %d gid %d\n", + archive_le16dec(p + offset), + archive_le16dec(p + offset + 2)); +#endif + if (datasize >= 2) + zip->uid = archive_le16dec(p + offset); + if (datasize >= 4) + zip->gid = archive_le16dec(p + offset + 2); + break; + default: + break; + } + offset += datasize; + } +#ifdef DEBUG + if (offset != zip->extra_length) + { + fprintf(stderr, + "Extra data field contents do not match reported size!"); + } +#endif +} diff --git a/lib/libarchive/archive_string.c b/lib/libarchive/archive_string.c new file mode 100644 index 000000000..4e57d62e1 --- /dev/null +++ b/lib/libarchive/archive_string.c @@ -0,0 +1,453 @@ +/*- + * Copyright (c) 2003-2007 Tim Kientzle + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. 
+ * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR(S) ``AS IS'' AND ANY EXPRESS OR + * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES + * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. + * IN NO EVENT SHALL THE AUTHOR(S) BE LIABLE FOR ANY DIRECT, INDIRECT, + * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT + * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF + * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + */ + +#include "archive_platform.h" +__FBSDID("$FreeBSD: head/lib/libarchive/archive_string.c 201095 2009-12-28 02:33:22Z kientzle $"); + +/* + * Basic resizable string support, to simplify manipulating arbitrary-sized + * strings while minimizing heap activity. 
+ */ + +#ifdef HAVE_STDLIB_H +#include <stdlib.h> +#endif +#ifdef HAVE_STRING_H +#include <string.h> +#endif +#ifdef HAVE_WCHAR_H +#include <wchar.h> +#endif +#if defined(_WIN32) && !defined(__CYGWIN__) +#include <windows.h> +#endif + +#include "archive_private.h" +#include "archive_string.h" + +struct archive_string * +__archive_string_append(struct archive_string *as, const char *p, size_t s) +{ + if (__archive_string_ensure(as, as->length + s + 1) == NULL) + __archive_errx(1, "Out of memory"); + memcpy(as->s + as->length, p, s); + as->s[as->length + s] = 0; + as->length += s; + return (as); +} + +void +__archive_string_copy(struct archive_string *dest, struct archive_string *src) +{ + if (src->length == 0) + dest->length = 0; + else { + if (__archive_string_ensure(dest, src->length + 1) == NULL) + __archive_errx(1, "Out of memory"); + memcpy(dest->s, src->s, src->length); + dest->length = src->length; + dest->s[dest->length] = 0; + } +} + +void +__archive_string_concat(struct archive_string *dest, struct archive_string *src) +{ + if (src->length > 0) { + if (__archive_string_ensure(dest, dest->length + src->length + 1) == NULL) + __archive_errx(1, "Out of memory"); + memcpy(dest->s + dest->length, src->s, src->length); + dest->length += src->length; + dest->s[dest->length] = 0; + } +} + +void +__archive_string_free(struct archive_string *as) +{ + as->length = 0; + as->buffer_length = 0; + if (as->s != NULL) { + free(as->s); + as->s = NULL; + } +} + +/* Returns NULL on any allocation failure. */ +struct archive_string * +__archive_string_ensure(struct archive_string *as, size_t s) +{ + /* If buffer is already big enough, don't reallocate. */ + if (as->s && (s <= as->buffer_length)) + return (as); + + /* + * Growing the buffer at least exponentially ensures that + * append operations are always linear in the number of + * characters appended. Using a smaller growth rate for + * larger buffers reduces memory waste somewhat at the cost of + * a larger constant factor.
+ */ + if (as->buffer_length < 32) + /* Start with a minimum 32-character buffer. */ + as->buffer_length = 32; + else if (as->buffer_length < 8192) + /* Buffers under 8k are doubled for speed. */ + as->buffer_length += as->buffer_length; + else { + /* Buffers 8k and over grow by at least 25% each time. */ + size_t old_length = as->buffer_length; + as->buffer_length += as->buffer_length / 4; + /* Be safe: If size wraps, release buffer and return NULL. */ + if (as->buffer_length < old_length) { + free(as->s); + as->s = NULL; + return (NULL); + } + } + /* + * The computation above is a lower limit to how much we'll + * grow the buffer. In any case, we have to grow it enough to + * hold the request. + */ + if (as->buffer_length < s) + as->buffer_length = s; + /* Now we can reallocate the buffer. */ + as->s = (char *)realloc(as->s, as->buffer_length); + if (as->s == NULL) + return (NULL); + return (as); +} + +struct archive_string * +__archive_strncat(struct archive_string *as, const void *_p, size_t n) +{ + size_t s; + const char *p, *pp; + + p = (const char *)_p; + + /* Like strlen(p), except won't examine positions beyond p[n]. */ + s = 0; + pp = p; + while (*pp && s < n) { + pp++; + s++; + } + return (__archive_string_append(as, p, s)); +} + +struct archive_string * +__archive_strappend_char(struct archive_string *as, char c) +{ + return (__archive_string_append(as, &c, 1)); +} + +/* + * Translates a wide character string into UTF-8 and appends + * to the archive_string. Note: returns NULL if conversion fails, + * but still leaves a best-effort conversion in the argument as. + */ +struct archive_string * +__archive_strappend_w_utf8(struct archive_string *as, const wchar_t *w) +{ + char *p; + unsigned wc; + char buff[256]; + struct archive_string *return_val = as; + + /* + * Convert one wide char at a time into 'buff', whenever that + * fills, append it to the string. + */ + p = buff; + while (*w != L'\0') { + /* Flush the buffer when we have <=16 bytes free. 
*/ + /* (No encoding has a single character >16 bytes.) */ + if ((size_t)(p - buff) >= (size_t)(sizeof(buff) - 16)) { + *p = '\0'; + archive_strcat(as, buff); + p = buff; + } + wc = *w++; + /* If this is a surrogate pair, assemble the full code point.*/ + /* Note: wc must not be wchar_t here, because the full code + * point can be more than 16 bits! */ + if (wc >= 0xD800 && wc <= 0xDBff + && *w >= 0xDC00 && *w <= 0xDFFF) { + wc -= 0xD800; + wc *= 0x400; + wc += (*w - 0xDC00); + wc += 0x10000; + ++w; + } + /* Translate code point to UTF8 */ + if (wc <= 0x7f) { + *p++ = (char)wc; + } else if (wc <= 0x7ff) { + *p++ = 0xc0 | ((wc >> 6) & 0x1f); + *p++ = 0x80 | (wc & 0x3f); + } else if (wc <= 0xffff) { + *p++ = 0xe0 | ((wc >> 12) & 0x0f); + *p++ = 0x80 | ((wc >> 6) & 0x3f); + *p++ = 0x80 | (wc & 0x3f); + } else if (wc <= 0x1fffff) { + *p++ = 0xf0 | ((wc >> 18) & 0x07); + *p++ = 0x80 | ((wc >> 12) & 0x3f); + *p++ = 0x80 | ((wc >> 6) & 0x3f); + *p++ = 0x80 | (wc & 0x3f); + } else { + /* Unicode has no codes larger than 0x1fffff. */ + /* TODO: use \uXXXX escape here instead of ? */ + *p++ = '?'; + return_val = NULL; + } + } + *p = '\0'; + archive_strcat(as, buff); + return (return_val); +} + +static int +utf8_to_unicode(int *pwc, const char *s, size_t n) +{ + int ch; + + /* + * Decode 1-4 bytes depending on the value of the first byte. + */ + ch = (unsigned char)*s; + if (ch == 0) { + return (0); /* Standard: return 0 for end-of-string. 
*/ + } + if ((ch & 0x80) == 0) { + *pwc = ch & 0x7f; + return (1); + } + if ((ch & 0xe0) == 0xc0) { + if (n < 2) + return (-1); + if ((s[1] & 0xc0) != 0x80) return (-1); + *pwc = ((ch & 0x1f) << 6) | (s[1] & 0x3f); + return (2); + } + if ((ch & 0xf0) == 0xe0) { + if (n < 3) + return (-1); + if ((s[1] & 0xc0) != 0x80) return (-1); + if ((s[2] & 0xc0) != 0x80) return (-1); + *pwc = ((ch & 0x0f) << 12) + | ((s[1] & 0x3f) << 6) + | (s[2] & 0x3f); + return (3); + } + if ((ch & 0xf8) == 0xf0) { + if (n < 4) + return (-1); + if ((s[1] & 0xc0) != 0x80) return (-1); + if ((s[2] & 0xc0) != 0x80) return (-1); + if ((s[3] & 0xc0) != 0x80) return (-1); + *pwc = ((ch & 0x07) << 18) + | ((s[1] & 0x3f) << 12) + | ((s[2] & 0x3f) << 6) + | (s[3] & 0x3f); + return (4); + } + /* Invalid first byte. */ + return (-1); +} + +/* + * Return a wide-character Unicode string by converting this archive_string + * from UTF-8. We assume that systems with 16-bit wchar_t always use + * UTF16 and systems with 32-bit wchar_t can accept UCS4. + */ +wchar_t * +__archive_string_utf8_w(struct archive_string *as) +{ + wchar_t *ws, *dest; + int wc, wc2;/* Must be large enough for a 21-bit Unicode code point. */ + const char *src; + int n; + + ws = (wchar_t *)malloc((as->length + 1) * sizeof(wchar_t)); + if (ws == NULL) + __archive_errx(1, "Out of memory"); + dest = ws; + src = as->s; + while (*src != '\0') { + n = utf8_to_unicode(&wc, src, 8); + if (n == 0) + break; + if (n < 0) { + free(ws); + return (NULL); + } + src += n; + if (wc >= 0xD800 && wc <= 0xDBFF) { + /* This is a leading surrogate; some idiot + * has translated UTF16 to UTF8 without combining + * surrogates; rebuild the full code point before + * continuing.
*/ + n = utf8_to_unicode(&wc2, src, 8); + if (n < 0) { + free(ws); + return (NULL); + } + if (n == 0) /* Ignore the leading surrogate */ + break; + if (wc2 < 0xDC00 || wc2 > 0xDFFF) { + /* If the second character isn't a + * trailing surrogate, then someone + * has really screwed up and this is + * invalid. */ + free(ws); + return (NULL); + } else { + src += n; + wc -= 0xD800; + wc *= 0x400; + wc += wc2 - 0xDC00; + wc += 0x10000; + } + } + if ((sizeof(wchar_t) < 4) && (wc > 0xffff)) { + /* We have a code point that won't fit into a + * wchar_t; convert it to a surrogate pair. */ + wc -= 0x10000; + *dest++ = ((wc >> 10) & 0x3ff) + 0xD800; + *dest++ = (wc & 0x3ff) + 0xDC00; + } else + *dest++ = wc; + } + *dest = L'\0'; + return (ws); +} + +#if defined(_WIN32) && !defined(__CYGWIN__) + +/* + * Translates a wide character string into current locale character set + * and appends to the archive_string. Note: returns NULL if conversion + * fails. + * + * Win32 builds use WideCharToMultiByte from the Windows API. + * (Maybe Cygwin should too? WideCharToMultiByte will know a + * lot more about local character encodings than the wcrtomb() + * wrapper is going to know.) + */ +struct archive_string * +__archive_strappend_w_mbs(struct archive_string *as, const wchar_t *w) +{ + char *p; + int l, wl; + BOOL useDefaultChar = FALSE; + + wl = (int)wcslen(w); + l = wl * 4 + 4; + p = malloc(l); + if (p == NULL) + __archive_errx(1, "Out of memory"); + /* To check a useDefaultChar is to simulate error handling of + * the my_wcstombs() which is running on non Windows system with + * wctomb(). + * And to set NULL for last argument is necessary when a codepage + * is not CP_ACP(current locale). 
+ */ + l = WideCharToMultiByte(CP_ACP, 0, w, wl, p, l, NULL, &useDefaultChar); + if (l == 0) { + free(p); + return (NULL); + } + __archive_string_append(as, p, l); + free(p); + return (as); +} + +#else + +/* + * Translates a wide character string into current locale character set + * and appends to the archive_string. Note: returns NULL if conversion + * fails. + * + * Non-Windows uses ISO C wcrtomb() or wctomb() to perform the conversion + * one character at a time. If a non-Windows platform doesn't have + * either of these, fall back to the built-in UTF8 conversion. + */ +struct archive_string * +__archive_strappend_w_mbs(struct archive_string *as, const wchar_t *w) +{ +#if !defined(HAVE_WCTOMB) && !defined(HAVE_WCRTOMB) + /* If there's no built-in locale support, fall back to UTF8 always. */ + return __archive_strappend_w_utf8(as, w); +#else + /* We cannot use the standard wcstombs() here because it + * cannot tell us how big the output buffer should be. So + * I've built a loop around wcrtomb() or wctomb() that + * converts a character at a time and resizes the string as + * needed. We prefer wcrtomb() when it's available because + * it's thread-safe. */ + int n; + char *p; + char buff[256]; +#if HAVE_WCRTOMB + mbstate_t shift_state; + + memset(&shift_state, 0, sizeof(shift_state)); +#else + /* Clear the shift state before starting. */ + wctomb(NULL, L'\0'); +#endif + + /* + * Convert one wide char at a time into 'buff', whenever that + * fills, append it to the string. + */ + p = buff; + while (*w != L'\0') { + /* Flush the buffer when we have <=16 bytes free. */ + /* (No encoding has a single character >16 bytes.) 
*/ + if ((size_t)(p - buff) >= (size_t)(sizeof(buff) - MB_CUR_MAX)) { + *p = '\0'; + archive_strcat(as, buff); + p = buff; + } +#if HAVE_WCRTOMB + n = wcrtomb(p, *w++, &shift_state); +#else + n = wctomb(p, *w++); +#endif + if (n == -1) + return (NULL); + p += n; + } + *p = '\0'; + archive_strcat(as, buff); + return (as); +#endif +} + +#endif /* _WIN32 && ! __CYGWIN__ */ diff --git a/lib/libarchive/archive_string.h b/lib/libarchive/archive_string.h new file mode 100644 index 000000000..25d7a8b1e --- /dev/null +++ b/lib/libarchive/archive_string.h @@ -0,0 +1,148 @@ +/*- + * Copyright (c) 2003-2007 Tim Kientzle + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR(S) ``AS IS'' AND ANY EXPRESS OR + * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES + * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. + * IN NO EVENT SHALL THE AUTHOR(S) BE LIABLE FOR ANY DIRECT, INDIRECT, + * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT + * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF + * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 
+ * + * $FreeBSD: head/lib/libarchive/archive_string.h 201092 2009-12-28 02:26:06Z kientzle $ + * + */ + +#ifndef __LIBARCHIVE_BUILD +#error This header is only to be used internally to libarchive. +#endif + +#ifndef ARCHIVE_STRING_H_INCLUDED +#define ARCHIVE_STRING_H_INCLUDED + +#include <stdarg.h> +#ifdef HAVE_STDLIB_H +#include <stdlib.h> /* required for wchar_t on some systems */ +#endif +#ifdef HAVE_STRING_H +#include <string.h> +#endif +#ifdef HAVE_WCHAR_H +#include <wchar.h> +#endif + +/* + * Basic resizable/reusable string support a la Java's "StringBuffer." + * + * Unlike sbuf(9), the buffers here are fully reusable and track the + * length throughout. + * + * Note that all visible symbols here begin with "__archive" as they + * are internal symbols not intended for anyone outside of this library + * to see or use. + */ + +struct archive_string { + char *s; /* Pointer to the storage */ + size_t length; /* Length of 's' */ + size_t buffer_length; /* Length of malloc-ed storage */ +}; + +/* Initialize an archive_string object on the stack or elsewhere. */ +#define archive_string_init(a) \ + do { (a)->s = NULL; (a)->length = 0; (a)->buffer_length = 0; } while(0) + +/* Append a C char to an archive_string, resizing as necessary. */ +struct archive_string * +__archive_strappend_char(struct archive_string *, char); +#define archive_strappend_char __archive_strappend_char + +/* Convert a wide-char string to UTF-8 and append the result. */ +struct archive_string * +__archive_strappend_w_utf8(struct archive_string *, const wchar_t *); +#define archive_strappend_w_utf8 __archive_strappend_w_utf8 + +/* Convert a wide-char string to current locale and append the result. */ +/* Returns NULL if conversion fails. */ +struct archive_string * +__archive_strappend_w_mbs(struct archive_string *, const wchar_t *); +#define archive_strappend_w_mbs __archive_strappend_w_mbs + +/* Basic append operation.
*/ +struct archive_string * +__archive_string_append(struct archive_string *as, const char *p, size_t s); + +/* Copy one archive_string to another */ +void +__archive_string_copy(struct archive_string *dest, struct archive_string *src); +#define archive_string_copy(dest, src) \ + __archive_string_copy(dest, src) + +/* Concatenate one archive_string to another */ +void +__archive_string_concat(struct archive_string *dest, struct archive_string *src); +#define archive_string_concat(dest, src) \ + __archive_string_concat(dest, src) + +/* Ensure that the underlying buffer is at least as large as the request. */ +struct archive_string * +__archive_string_ensure(struct archive_string *, size_t); +#define archive_string_ensure __archive_string_ensure + +/* Append C string, which may lack trailing \0. */ +/* The source is declared void * here because this gets used with + * "signed char *", "unsigned char *" and "char *" arguments. + * Declaring it "char *" as with some of the other functions just + * leads to a lot of extra casts. */ +struct archive_string * +__archive_strncat(struct archive_string *, const void *, size_t); +#define archive_strncat __archive_strncat + +/* Append a C string to an archive_string, resizing as necessary. */ +#define archive_strcat(as,p) __archive_string_append((as),(p),strlen(p)) + +/* Copy a C string to an archive_string, resizing as necessary. */ +#define archive_strcpy(as,p) \ + ((as)->length = 0, __archive_string_append((as), (p), p == NULL ? 0 : strlen(p))) + +/* Copy a C string to an archive_string with limit, resizing as necessary. */ +#define archive_strncpy(as,p,l) \ + ((as)->length=0, archive_strncat((as), (p), (l))) + +/* Return length of string. */ +#define archive_strlen(a) ((a)->length) + +/* Set string length to zero. */ +#define archive_string_empty(a) ((a)->length = 0) + +/* Release any allocated storage resources. 
*/ +void __archive_string_free(struct archive_string *); +#define archive_string_free __archive_string_free + +/* Like 'vsprintf', but resizes the underlying string as necessary. */ +void __archive_string_vsprintf(struct archive_string *, const char *, + va_list); +#define archive_string_vsprintf __archive_string_vsprintf + +void __archive_string_sprintf(struct archive_string *, const char *, ...); +#define archive_string_sprintf __archive_string_sprintf + +/* Allocates a fresh buffer and converts as (assumed to be UTF-8) into it. + * Returns NULL if conversion failed in any way. */ +wchar_t *__archive_string_utf8_w(struct archive_string *as); + + +#endif diff --git a/lib/libarchive/archive_string_sprintf.c b/lib/libarchive/archive_string_sprintf.c new file mode 100644 index 000000000..6d3d8edde --- /dev/null +++ b/lib/libarchive/archive_string_sprintf.c @@ -0,0 +1,164 @@ +/*- + * Copyright (c) 2003-2007 Tim Kientzle + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR(S) ``AS IS'' AND ANY EXPRESS OR + * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES + * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. 
+ * IN NO EVENT SHALL THE AUTHOR(S) BE LIABLE FOR ANY DIRECT, INDIRECT, + * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT + * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF + * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + */ + +#include "archive_platform.h" +__FBSDID("$FreeBSD: head/lib/libarchive/archive_string_sprintf.c 189435 2009-03-06 05:14:55Z kientzle $"); + +/* + * The use of printf()-family functions can be troublesome + * for space-constrained applications. In addition, correctly + * implementing this function in terms of vsnprintf() requires + * two calls (one to determine the size, another to format the + * result), which in turn requires duplicating the argument list + * using va_copy, which isn't yet universally available. + * + * So, I've implemented a bare minimum of printf()-like capability + * here. This is only used to format error messages, so doesn't + * require any floating-point support or field-width handling. + */ + +#include + +#include "archive_string.h" +#include "archive_private.h" + +/* + * Utility functions to format signed/unsigned integers and append + * them to an archive_string. + */ +static void +append_uint(struct archive_string *as, uintmax_t d, unsigned base) +{ + static const char *digits = "0123456789abcdef"; + if (d >= base) + append_uint(as, d/base, base); + archive_strappend_char(as, digits[d % base]); +} + +static void +append_int(struct archive_string *as, intmax_t d, unsigned base) +{ + if (d < 0) { + archive_strappend_char(as, '-'); + d = -d; + } + append_uint(as, d, base); +} + + +void +__archive_string_sprintf(struct archive_string *as, const char *fmt, ...) 
+{ + va_list ap; + + va_start(ap, fmt); + archive_string_vsprintf(as, fmt, ap); + va_end(ap); +} + +/* + * Like 'vsprintf', but ensures the target is big enough, resizing if + * necessary. + */ +void +__archive_string_vsprintf(struct archive_string *as, const char *fmt, + va_list ap) +{ + char long_flag; + intmax_t s; /* Signed integer temp. */ + uintmax_t u; /* Unsigned integer temp. */ + const char *p, *p2; + + if (__archive_string_ensure(as, 64) == NULL) + __archive_errx(1, "Out of memory"); + + if (fmt == NULL) { + as->s[0] = 0; + return; + } + + for (p = fmt; *p != '\0'; p++) { + const char *saved_p = p; + + if (*p != '%') { + archive_strappend_char(as, *p); + continue; + } + + p++; + + long_flag = '\0'; + switch(*p) { + case 'j': + long_flag = 'j'; + p++; + break; + case 'l': + long_flag = 'l'; + p++; + break; + } + + switch (*p) { + case '%': + __archive_strappend_char(as, '%'); + break; + case 'c': + s = va_arg(ap, int); + __archive_strappend_char(as, s); + break; + case 'd': + switch(long_flag) { + case 'j': s = va_arg(ap, intmax_t); break; + case 'l': s = va_arg(ap, long); break; + default: s = va_arg(ap, int); break; + } + append_int(as, s, 10); + break; + case 's': + p2 = va_arg(ap, char *); + archive_strcat(as, p2); + break; + case 'o': case 'u': case 'x': case 'X': + /* Common handling for unsigned integer formats. */ + switch(long_flag) { + case 'j': u = va_arg(ap, uintmax_t); break; + case 'l': u = va_arg(ap, unsigned long); break; + default: u = va_arg(ap, unsigned int); break; + } + /* Format it in the correct base. */ + switch (*p) { + case 'o': append_uint(as, u, 8); break; + case 'u': append_uint(as, u, 10); break; + default: append_uint(as, u, 16); break; + } + break; + default: + /* Rewind and print the initial '%' literally. 
*/ + p = saved_p; + archive_strappend_char(as, *p); + } + } +} diff --git a/lib/libarchive/archive_util.3 b/lib/libarchive/archive_util.3 new file mode 100644 index 000000000..98609e565 --- /dev/null +++ b/lib/libarchive/archive_util.3 @@ -0,0 +1,160 @@ +.\" Copyright (c) 2003-2007 Tim Kientzle +.\" All rights reserved. +.\" +.\" Redistribution and use in source and binary forms, with or without +.\" modification, are permitted provided that the following conditions +.\" are met: +.\" 1. Redistributions of source code must retain the above copyright +.\" notice, this list of conditions and the following disclaimer. +.\" 2. Redistributions in binary form must reproduce the above copyright +.\" notice, this list of conditions and the following disclaimer in the +.\" documentation and/or other materials provided with the distribution. +.\" +.\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND +.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +.\" ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE +.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS +.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) +.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT +.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY +.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF +.\" SUCH DAMAGE. 
+.\" +.\" $FreeBSD: head/lib/libarchive/archive_util.3 201098 2009-12-28 02:58:14Z kientzle $ +.\" +.Dd January 8, 2005 +.Dt archive_util 3 +.Os +.Sh NAME +.Nm archive_clear_error , +.Nm archive_compression , +.Nm archive_compression_name , +.Nm archive_copy_error , +.Nm archive_errno , +.Nm archive_error_string , +.Nm archive_file_count , +.Nm archive_format , +.Nm archive_format_name , +.Nm archive_set_error +.Nd libarchive utility functions +.Sh SYNOPSIS +.In archive.h +.Ft void +.Fn archive_clear_error "struct archive *" +.Ft int +.Fn archive_compression "struct archive *" +.Ft const char * +.Fn archive_compression_name "struct archive *" +.Ft void +.Fn archive_copy_error "struct archive *" "struct archive *" +.Ft int +.Fn archive_errno "struct archive *" +.Ft const char * +.Fn archive_error_string "struct archive *" +.Ft int +.Fn archive_file_count "struct archive *" +.Ft int +.Fn archive_format "struct archive *" +.Ft const char * +.Fn archive_format_name "struct archive *" +.Ft void +.Fo archive_set_error +.Fa "struct archive *" +.Fa "int error_code" +.Fa "const char *fmt" +.Fa "..." +.Fc +.Sh DESCRIPTION +These functions provide access to various information about the +.Tn struct archive +object used in the +.Xr libarchive 3 +library. +.Bl -tag -compact -width indent +.It Fn archive_clear_error +Clears any error information left over from a previous call. +Not generally used in client code. +.It Fn archive_compression +Returns a numeric code indicating the current compression. +This value is set by +.Fn archive_read_open . +.It Fn archive_compression_name +Returns a text description of the current compression suitable for display. +.It Fn archive_copy_error +Copies error information from one archive to another. +.It Fn archive_errno +Returns a numeric error code (see +.Xr errno 2 ) +indicating the reason for the most recent error return. +.It Fn archive_error_string +Returns a textual error message suitable for display. 
+The error message here is usually more specific than that +obtained from passing the result of +.Fn archive_errno +to +.Xr strerror 3 . +.It Fn archive_file_count +Returns a count of the number of files processed by this archive object. +The count is incremented by calls to +.Xr archive_write_header +or +.Xr archive_read_next_header . +.It Fn archive_format +Returns a numeric code indicating the format of the current +archive entry. +This value is set by a successful call to +.Fn archive_read_next_header . +Note that it is common for this value to change from +entry to entry. +For example, a tar archive might have several entries that +utilize GNU tar extensions and several entries that do not. +These entries will have different format codes. +.It Fn archive_format_name +A textual description of the format of the current entry. +.It Fn archive_set_error +Sets the numeric error code and error description that will be returned +by +.Fn archive_errno +and +.Fn archive_error_string . +This function should be used within I/O callbacks to set system-specific +error codes and error descriptions. +This function accepts a printf-like format string and arguments. +However, you should be careful to use only the following printf +format specifiers: +.Dq %c , +.Dq %d , +.Dq %jd , +.Dq %jo , +.Dq %ju , +.Dq %jx , +.Dq %ld , +.Dq %lo , +.Dq %lu , +.Dq %lx , +.Dq %o , +.Dq %u , +.Dq %s , +.Dq %x , +.Dq %% . +Field-width specifiers and other printf features are +not uniformly supported and should not be used. +.El +.Sh SEE ALSO +.Xr archive_read 3 , +.Xr archive_write 3 , +.Xr libarchive 3 , +.Xr printf 3 +.Sh HISTORY +The +.Nm libarchive +library first appeared in +.Fx 5.3 . +.Sh AUTHORS +.An -nosplit +The +.Nm libarchive +library was written by +.An Tim Kientzle Aq kientzle@acm.org . 
diff --git a/lib/libarchive/archive_util.c b/lib/libarchive/archive_util.c new file mode 100644 index 000000000..be78d4cd9 --- /dev/null +++ b/lib/libarchive/archive_util.c @@ -0,0 +1,406 @@ +/*- + * Copyright (c) 2009 Michihiro NAKAJIMA + * Copyright (c) 2003-2007 Tim Kientzle + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR(S) ``AS IS'' AND ANY EXPRESS OR + * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES + * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. + * IN NO EVENT SHALL THE AUTHOR(S) BE LIABLE FOR ANY DIRECT, INDIRECT, + * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT + * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF + * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 
+ */ + +#include "archive_platform.h" +__FBSDID("$FreeBSD: head/lib/libarchive/archive_util.c 201098 2009-12-28 02:58:14Z kientzle $"); + +#ifdef HAVE_SYS_TYPES_H +#include <sys/types.h> +#endif +#ifdef HAVE_STDLIB_H +#include <stdlib.h> +#endif +#ifdef HAVE_STRING_H +#include <string.h> +#endif + +#include "archive.h" +#include "archive_private.h" +#include "archive_string.h" + +#if ARCHIVE_VERSION_NUMBER < 3000000 +/* These disappear in libarchive 3.0 */ +/* Deprecated. */ +int +archive_api_feature(void) +{ + return (ARCHIVE_API_FEATURE); +} + +/* Deprecated. */ +int +archive_api_version(void) +{ + return (ARCHIVE_API_VERSION); +} + +/* Deprecated synonym for archive_version_number() */ +int +archive_version_stamp(void) +{ + return (archive_version_number()); +} + +/* Deprecated synonym for archive_version_string() */ +const char * +archive_version(void) +{ + return (archive_version_string()); +} +#endif + +int +archive_version_number(void) +{ + return (ARCHIVE_VERSION_NUMBER); +} + +const char * +archive_version_string(void) +{ + return (ARCHIVE_VERSION_STRING); +} + +int +archive_errno(struct archive *a) +{ + return (a->archive_error_number); +} + +const char * +archive_error_string(struct archive *a) +{ + + if (a->error != NULL && *a->error != '\0') + return (a->error); + else + return ("(Empty error message)"); +} + +int +archive_file_count(struct archive *a) +{ + return (a->file_count); +} + +int +archive_format(struct archive *a) +{ + return (a->archive_format); +} + +const char * +archive_format_name(struct archive *a) +{ + return (a->archive_format_name); +} + + +int +archive_compression(struct archive *a) +{ + return (a->compression_code); +} + +const char * +archive_compression_name(struct archive *a) +{ + return (a->compression_name); +} + + +/* + * Return a count of the number of compressed bytes processed.
+ */ +#ifndef __minix +int64_t +archive_position_compressed(struct archive *a) +{ + return (a->raw_position); +} +#else +off_t +archive_position_compressed(struct archive *a) +{ + return (a->raw_position); +} +#endif + +/* + * Return a count of the number of uncompressed bytes processed. + */ +#ifndef __minix +int64_t +archive_position_uncompressed(struct archive *a) +{ + return (a->file_position); +} +#else +off_t +archive_position_uncompressed(struct archive *a) +{ + return (a->file_position); +} +#endif +void +archive_clear_error(struct archive *a) +{ + archive_string_empty(&a->error_string); + a->error = NULL; +} + +void +archive_set_error(struct archive *a, int error_number, const char *fmt, ...) +{ + va_list ap; + + a->archive_error_number = error_number; + if (fmt == NULL) { + a->error = NULL; + return; + } + + va_start(ap, fmt); + archive_string_vsprintf(&(a->error_string), fmt, ap); + va_end(ap); + a->error = a->error_string.s; +} + +void +archive_copy_error(struct archive *dest, struct archive *src) +{ + dest->archive_error_number = src->archive_error_number; + + archive_string_copy(&dest->error_string, &src->error_string); + dest->error = dest->error_string.s; +} + +void +__archive_errx(int retvalue, const char *msg) +{ + static const char *msg1 = "Fatal Internal Error in libarchive: "; + size_t s; + + s = write(2, msg1, strlen(msg1)); + (void)s; /* UNUSED */ + s = write(2, msg, strlen(msg)); + (void)s; /* UNUSED */ + s = write(2, "\n", 1); + (void)s; /* UNUSED */ + exit(retvalue); +} + +/* + * Parse option strings + * Detail of option format. + * - The option can accept: + * "opt-name", "!opt-name", "opt-name=value". + * + * - The option entries are separated by commas. + * e.g. "compression=9,opt=XXX,opt-b=ZZZ" + * + * - An option name consists of '-' and lowercase letters, + * but '-' cannot be used as the first character. + * (Regular expression is [a-z][-a-z]+) + * + * - For a specific format/filter, use the format name with ':'.
+ * e.g. "zip:compression=9" + * (This "compression=9" option entry is for "zip" format only) + * + * If other entries follow it, they are not for + * the specific format/filter. + * e.g. handle "zip:compression=9,opt=XXX,opt-b=ZZZ" + * "zip" format/filter handler will get "compression=9" + * all format/filter handlers will get "opt=XXX" + * all format/filter handlers will get "opt-b=ZZZ" + * + * - Whitespace and tab are bypassed. + * + */ +int +__archive_parse_options(const char *p, const char *fn, int keysize, char *key, + int valsize, char *val) +{ + const char *p_org; + int apply; + int kidx, vidx; + int negative; + enum { + /* Requested for initialization. */ + INIT, + /* Finding format/filter-name and option-name. */ + F_BOTH, + /* Finding option-name only. + * (already detected format/filter-name) */ + F_NAME, + /* Getting option-value. */ + G_VALUE + } state; + + p_org = p; + state = INIT; + kidx = vidx = negative = 0; + apply = 1; + while (*p) { + switch (state) { + case INIT: + kidx = vidx = 0; + negative = 0; + apply = 1; + state = F_BOTH; + break; + case F_BOTH: + case F_NAME: + if ((*p >= 'a' && *p <= 'z') || + (*p >= '0' && *p <= '9') || *p == '-') { + if (kidx == 0 && !(*p >= 'a' && *p <= 'z')) + /* Illegal sequence. */ + return (-1); + if (kidx >= keysize -1) + /* Too many characters. */ + return (-1); + key[kidx++] = *p++; + } else if (*p == '!') { + if (kidx != 0) + /* Illegal sequence. */ + return (-1); + negative = 1; + ++p; + } else if (*p == ',') { + if (kidx == 0) + /* Illegal sequence. */ + return (-1); + if (!negative) + val[vidx++] = '1'; + /* We have got boolean option data. */ + ++p; + if (apply) + goto complete; + else + /* This option does not apply to the + * format which the fn variable + * indicate. */ + state = INIT; + } else if (*p == ':') { + /* obuf data is format name */ + if (state == F_NAME) + /* We already found it. */ + return (-1); + if (kidx == 0) + /* Illegal sequence.
*/ + return (-1); + if (negative) + /* We cannot accept "!format-name:". */ + return (-1); + key[kidx] = '\0'; + if (strcmp(fn, key) != 0) + /* This option does not apply to the + * format which the fn variable + * indicate. */ + apply = 0; + kidx = 0; + ++p; + state = F_NAME; + } else if (*p == '=') { + if (kidx == 0) + /* Illegal sequence. */ + return (-1); + if (negative) + /* We cannot accept "!opt-name=value". */ + return (-1); + ++p; + state = G_VALUE; + } else if (*p == ' ') { + /* Pass the space character */ + ++p; + } else { + /* Illegal character. */ + return (-1); + } + break; + case G_VALUE: + if (*p == ',') { + if (vidx == 0) + /* Illegal sequence. */ + return (-1); + /* We have got option data. */ + ++p; + if (apply) + goto complete; + else + /* This option does not apply to the + * format which the fn variable + * indicate. */ + state = INIT; + } else if (*p == ' ') { + /* Pass the space character */ + ++p; + } else { + if (vidx >= valsize -1) + /* Too many characters. */ + return (-1); + val[vidx++] = *p++; + } + break; + } + } + + switch (state) { + case F_BOTH: + case F_NAME: + if (kidx != 0) { + if (!negative) + val[vidx++] = '1'; + /* We have got boolean option. */ + if (apply) + /* This option apply to the format which the + * fn variable indicate. */ + goto complete; + } + break; + case G_VALUE: + if (vidx == 0) + /* Illegal sequence. */ + return (-1); + /* We have got option value. */ + if (apply) + /* This option apply to the format which the fn + * variable indicate. */ + goto complete; + break; + case INIT:/* nothing */ + break; + } + + /* End of Option string. 
*/ + return (0); + +complete: + key[kidx] = '\0'; + val[vidx] = '\0'; + /* Return a size which we've consumed for detecting option */ + return ((int)(p - p_org)); +} diff --git a/lib/libarchive/archive_virtual.c b/lib/libarchive/archive_virtual.c new file mode 100644 index 000000000..a5c0b39b9 --- /dev/null +++ b/lib/libarchive/archive_virtual.c @@ -0,0 +1,94 @@ +/*- + * Copyright (c) 2003-2007 Tim Kientzle + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR(S) ``AS IS'' AND ANY EXPRESS OR + * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES + * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. + * IN NO EVENT SHALL THE AUTHOR(S) BE LIABLE FOR ANY DIRECT, INDIRECT, + * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT + * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF + * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 
+ */ + +#include "archive_platform.h" +__FBSDID("$FreeBSD: head/lib/libarchive/archive_virtual.c 201098 2009-12-28 02:58:14Z kientzle $"); + +#include "archive.h" +#include "archive_entry.h" +#include "archive_private.h" + +int +archive_write_close(struct archive *a) +{ + return ((a->vtable->archive_close)(a)); +} + +int +archive_read_close(struct archive *a) +{ + return ((a->vtable->archive_close)(a)); +} + +#if ARCHIVE_API_VERSION > 1 +int +archive_write_finish(struct archive *a) +{ + return ((a->vtable->archive_finish)(a)); +} +#else +/* Temporarily allow library to compile with either 1.x or 2.0 API. */ +void +archive_write_finish(struct archive *a) +{ + (void)(a->vtable->archive_finish)(a); +} +#endif + +int +archive_read_finish(struct archive *a) +{ + return ((a->vtable->archive_finish)(a)); +} + +int +archive_write_header(struct archive *a, struct archive_entry *entry) +{ + ++a->file_count; + return ((a->vtable->archive_write_header)(a, entry)); +} + +int +archive_write_finish_entry(struct archive *a) +{ + return ((a->vtable->archive_write_finish_entry)(a)); +} + +#if ARCHIVE_API_VERSION > 1 +ssize_t +#else +/* Temporarily allow library to compile with either 1.x or 2.0 API. */ +int +#endif +archive_write_data(struct archive *a, const void *buff, size_t s) +{ + return ((a->vtable->archive_write_data)(a, buff, s)); +} + +ssize_t +archive_write_data_block(struct archive *a, const void *buff, size_t s, off_t o) +{ + return ((a->vtable->archive_write_data_block)(a, buff, s, o)); +} diff --git a/lib/libarchive/archive_write.3 b/lib/libarchive/archive_write.3 new file mode 100644 index 000000000..ffe0c9b45 --- /dev/null +++ b/lib/libarchive/archive_write.3 @@ -0,0 +1,629 @@ +.\" Copyright (c) 2003-2007 Tim Kientzle +.\" All rights reserved. +.\" +.\" Redistribution and use in source and binary forms, with or without +.\" modification, are permitted provided that the following conditions +.\" are met: +.\" 1. 
Redistributions of source code must retain the above copyright +.\" notice, this list of conditions and the following disclaimer. +.\" 2. Redistributions in binary form must reproduce the above copyright +.\" notice, this list of conditions and the following disclaimer in the +.\" documentation and/or other materials provided with the distribution. +.\" +.\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND +.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +.\" ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE +.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS +.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) +.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT +.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY +.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF +.\" SUCH DAMAGE. 
+.\" +.\" $FreeBSD: head/lib/libarchive/archive_write.3 201110 2009-12-28 03:31:29Z kientzle $ +.\" +.Dd May 11, 2008 +.Dt archive_write 3 +.Os +.Sh NAME +.Nm archive_write_new , +.Nm archive_write_set_format_cpio , +.Nm archive_write_set_format_pax , +.Nm archive_write_set_format_pax_restricted , +.Nm archive_write_set_format_shar , +.Nm archive_write_set_format_shar_binary , +.Nm archive_write_set_format_ustar , +.Nm archive_write_get_bytes_per_block , +.Nm archive_write_set_bytes_per_block , +.Nm archive_write_set_bytes_in_last_block , +.Nm archive_write_set_compression_bzip2 , +.Nm archive_write_set_compression_compress , +.Nm archive_write_set_compression_gzip , +.Nm archive_write_set_compression_none , +.Nm archive_write_set_compression_program , +.Nm archive_write_set_compressor_options , +.Nm archive_write_set_format_options , +.Nm archive_write_set_options , +.Nm archive_write_open , +.Nm archive_write_open_fd , +.Nm archive_write_open_FILE , +.Nm archive_write_open_filename , +.Nm archive_write_open_memory , +.Nm archive_write_header , +.Nm archive_write_data , +.Nm archive_write_finish_entry , +.Nm archive_write_close , +.Nm archive_write_finish +.Nd functions for creating archives +.Sh SYNOPSIS +.In archive.h +.Ft struct archive * +.Fn archive_write_new "void" +.Ft int +.Fn archive_write_get_bytes_per_block "struct archive *" +.Ft int +.Fn archive_write_set_bytes_per_block "struct archive *" "int bytes_per_block" +.Ft int +.Fn archive_write_set_bytes_in_last_block "struct archive *" "int" +.Ft int +.Fn archive_write_set_compression_bzip2 "struct archive *" +.Ft int +.Fn archive_write_set_compression_compress "struct archive *" +.Ft int +.Fn archive_write_set_compression_gzip "struct archive *" +.Ft int +.Fn archive_write_set_compression_none "struct archive *" +.Ft int +.Fn archive_write_set_compression_program "struct archive *" "const char * cmd" +.Ft int +.Fn archive_write_set_format_cpio "struct archive *" +.Ft int +.Fn archive_write_set_format_pax 
"struct archive *" +.Ft int +.Fn archive_write_set_format_pax_restricted "struct archive *" +.Ft int +.Fn archive_write_set_format_shar "struct archive *" +.Ft int +.Fn archive_write_set_format_shar_binary "struct archive *" +.Ft int +.Fn archive_write_set_format_ustar "struct archive *" +.Ft int +.Fn archive_write_set_format_options "struct archive *" "const char *" +.Ft int +.Fn archive_write_set_compressor_options "struct archive *" "const char *" +.Ft int +.Fn archive_write_set_options "struct archive *" "const char *" +.Ft int +.Fo archive_write_open +.Fa "struct archive *" +.Fa "void *client_data" +.Fa "archive_open_callback *" +.Fa "archive_write_callback *" +.Fa "archive_close_callback *" +.Fc +.Ft int +.Fn archive_write_open_fd "struct archive *" "int fd" +.Ft int +.Fn archive_write_open_FILE "struct archive *" "FILE *file" +.Ft int +.Fn archive_write_open_filename "struct archive *" "const char *filename" +.Ft int +.Fo archive_write_open_memory +.Fa "struct archive *" +.Fa "void *buffer" +.Fa "size_t bufferSize" +.Fa "size_t *outUsed" +.Fc +.Ft int +.Fn archive_write_header "struct archive *" "struct archive_entry *" +.Ft ssize_t +.Fn archive_write_data "struct archive *" "const void *" "size_t" +.Ft int +.Fn archive_write_finish_entry "struct archive *" +.Ft int +.Fn archive_write_close "struct archive *" +.Ft int +.Fn archive_write_finish "struct archive *" +.Sh DESCRIPTION +These functions provide a complete API for creating streaming +archive files. +The general process is to first create the +.Tn struct archive +object, set any desired options, initialize the archive, append entries, then +close the archive and release all resources. +The following summary describes the functions in approximately +the order they are ordinarily used: +.Bl -tag -width indent +.It Fn archive_write_new +Allocates and initializes a +.Tn struct archive +object suitable for writing a tar archive. 
+.It Fn archive_write_set_bytes_per_block +Sets the block size used for writing the archive data. +Every call to the write callback function, except possibly the last one, will +use this value for the length. +The third parameter is a boolean that specifies whether or not the final block +written will be padded to the full block size. +If it is zero, the last block will not be padded. +If it is non-zero, padding will be added both before and after compression. +The default is to use a block size of 10240 bytes and to pad the last block. +Note that a block size of zero will suppress internal blocking +and cause writes to be sent directly to the write callback as they occur. +.It Fn archive_write_get_bytes_per_block +Retrieve the block size to be used for writing. +A value of -1 here indicates that the library should use default values. +A value of zero indicates that internal blocking is suppressed. +.It Fn archive_write_set_bytes_in_last_block +Sets the block size used for writing the last block. +If this value is zero, the last block will be padded to the same size +as the other blocks. +Otherwise, the final block will be padded to a multiple of this size. +In particular, setting it to 1 will cause the final block to not be padded. +For compressed output, any padding generated by this option +is applied only after the compression. +The uncompressed data is always unpadded. +The default is to pad the last block to the full block size (note that +.Fn archive_write_open_filename +will set this based on the file type). +Unlike the other +.Dq set +functions, this function can be called after the archive is opened. +.It Fn archive_write_get_bytes_in_last_block +Retrieve the currently-set value for last block size. +A value of -1 here indicates that the library should use default values. 
+.It Xo +.Fn archive_write_set_format_cpio , +.Fn archive_write_set_format_pax , +.Fn archive_write_set_format_pax_restricted , +.Fn archive_write_set_format_shar , +.Fn archive_write_set_format_shar_binary , +.Fn archive_write_set_format_ustar +.Xc +Sets the format that will be used for the archive. +The library can write +POSIX octet-oriented cpio format archives, +POSIX-standard +.Dq pax interchange +format archives, +traditional +.Dq shar +archives, +enhanced +.Dq binary +shar archives that store a variety of file attributes and handle binary files, +and +POSIX-standard +.Dq ustar +archives. +The pax interchange format is a backwards-compatible tar format that +adds key/value attributes to each entry and supports arbitrary +filenames, linknames, uids, sizes, etc. +.Dq Restricted pax interchange format +is the library default; this is the same as pax format, but suppresses +the pax extended header for most normal files. +In most cases, this will result in ordinary ustar archives. +.It Xo +.Fn archive_write_set_compression_bzip2 , +.Fn archive_write_set_compression_compress , +.Fn archive_write_set_compression_gzip , +.Fn archive_write_set_compression_none +.Xc +The resulting archive will be compressed as specified. +Note that the compressed output is always properly blocked. +.It Fn archive_write_set_compression_program +The archive will be fed into the specified compression program. +The output of that program is blocked and written to the client +write callbacks. +.It Xo +.Fn archive_write_set_compressor_options , +.Fn archive_write_set_format_options , +.Fn archive_write_set_options +.Xc +Specifies options that will be passed to the currently-enabled +compressor and/or format writer. +The argument is a comma-separated list of individual options. +Individual options have one of the following forms: +.Bl -tag -compact -width indent +.It Ar option=value +The option/value pair will be provided to every module. 
+Modules that do not accept an option with this name will ignore it. +.It Ar option +The option will be provided to every module with a value of +.Dq 1 . +.It Ar !option +The option will be provided to every module with a NULL value. +.It Ar module:option=value , Ar module:option , Ar module:!option +As above, but the corresponding option and value will be provided +only to modules whose name matches +.Ar module . +.El +The return value will be +.Cm ARCHIVE_OK +if any module accepts the option, or +.Cm ARCHIVE_WARN +if no module accepted the option, or +.Cm ARCHIVE_FATAL +if there was a fatal error while attempting to process the option. +.Pp +The currently supported options are: +.Bl -tag -compact -width indent +.It Compressor gzip +.Bl -tag -compact -width indent +.It Cm compression-level +The value is interpreted as a decimal integer specifying the +gzip compression level. +.El +.It Compressor xz +.Bl -tag -compact -width indent +.It Cm compression-level +The value is interpreted as a decimal integer specifying the +compression level. +.El +.It Format mtree +.Bl -tag -compact -width indent +.It Cm cksum , Cm device , Cm flags , Cm gid , Cm gname , Cm indent , Cm link , Cm md5 , Cm mode , Cm nlink , Cm rmd160 , Cm sha1 , Cm sha256 , Cm sha384 , Cm sha512 , Cm size , Cm time , Cm uid , Cm uname +Enable a particular keyword in the mtree output. +Prefix with an exclamation mark to disable the corresponding keyword. +The default is equivalent to +.Dq device, flags, gid, gname, link, mode, nlink, size, time, type, uid, uname . +.It Cm all +Enables all of the above keywords. +.It Cm use-set +Enables generation of +.Cm /set +lines that specify default values for the following files and/or directories. +.It Cm indent +XXX needs explanation XXX +.El +.El +.It Fn archive_write_open +Freeze the settings, open the archive, and prepare for writing entries. 
+This is the most generic form of this function, which accepts +pointers to three callback functions which will be invoked by +the compression layer to write the constructed archive. +.It Fn archive_write_open_fd +A convenience form of +.Fn archive_write_open +that accepts a file descriptor. +The +.Fn archive_write_open_fd +function is safe for use with tape drives or other +block-oriented devices. +.It Fn archive_write_open_FILE +A convenience form of +.Fn archive_write_open +that accepts a +.Ft "FILE *" +pointer. +Note that +.Fn archive_write_open_FILE +is not safe for writing to tape drives or other devices +that require correct blocking. +.It Fn archive_write_open_file +A deprecated synonym for +.Fn archive_write_open_filename . +.It Fn archive_write_open_filename +A convenience form of +.Fn archive_write_open +that accepts a filename. +A NULL argument indicates that the output should be written to standard output; +an argument of +.Dq - +will open a file with that name. +If you have not invoked +.Fn archive_write_set_bytes_in_last_block , +then +.Fn archive_write_open_filename +will adjust the last-block padding depending on the file: +it will enable padding when writing to standard output or +to a character or block device node, it will disable padding otherwise. +You can override this by manually invoking +.Fn archive_write_set_bytes_in_last_block +before calling +.Fn archive_write_open . +The +.Fn archive_write_open_filename +function is safe for use with tape drives or other +block-oriented devices. +.It Fn archive_write_open_memory +A convenience form of +.Fn archive_write_open +that accepts a pointer to a block of memory that will receive +the archive. +The final +.Ft "size_t *" +argument points to a variable that will be updated +after each write to reflect how much of the buffer +is currently in use. +You should be careful to ensure that this variable +remains allocated until after the archive is +closed. 
+.It Fn archive_write_header +Build and write a header using the data in the provided +.Tn struct archive_entry +structure. +See +.Xr archive_entry 3 +for information on creating and populating +.Tn struct archive_entry +objects. +.It Fn archive_write_data +Write data corresponding to the header just written. +Returns the number of bytes written or -1 on error. +.It Fn archive_write_finish_entry +Close out the entry just written. +In particular, this writes out the final padding required by some formats. +Ordinarily, clients never need to call this, as it +is called automatically by +.Fn archive_write_header +and +.Fn archive_write_close +as needed. +.It Fn archive_write_close +Complete the archive and invoke the close callback. +.It Fn archive_write_finish +Invokes +.Fn archive_write_close +if it was not invoked manually, then releases all resources. +Note that this function was declared to return +.Ft void +in libarchive 1.x, which made it impossible to detect errors when +.Fn archive_write_close +was invoked implicitly from this function. +This is corrected beginning with libarchive 2.0. +.El +More information about the +.Va struct archive +object and the overall design of the library can be found in the +.Xr libarchive 3 +overview. +.Sh IMPLEMENTATION +Compression support is built into libarchive, which uses zlib and bzlib +to handle gzip and bzip2 compression, respectively. +.Sh CLIENT CALLBACKS +To use this library, you will need to define and register +callback functions that will be invoked to write data to the +resulting archive. +These functions are registered by calling +.Fn archive_write_open : +.Bl -item -offset indent +.It +.Ft typedef int +.Fn archive_open_callback "struct archive *" "void *client_data" +.El +.Pp +The open callback is invoked by +.Fn archive_write_open . +It should return +.Cm ARCHIVE_OK +if the underlying file or data source is successfully +opened.
+If the open fails, it should call +.Fn archive_set_error +to register an error code and message and return +.Cm ARCHIVE_FATAL . +.Bl -item -offset indent +.It +.Ft typedef ssize_t +.Fo archive_write_callback +.Fa "struct archive *" +.Fa "void *client_data" +.Fa "const void *buffer" +.Fa "size_t length" +.Fc +.El +.Pp +The write callback is invoked whenever the library +needs to write raw bytes to the archive. +For correct blocking, each call to the write callback function +should translate into a single +.Xr write 2 +system call. +This is especially critical when writing archives to tape drives. +On success, the write callback should return the +number of bytes actually written. +On error, the callback should invoke +.Fn archive_set_error +to register an error code and message and return -1. +.Bl -item -offset indent +.It +.Ft typedef int +.Fn archive_close_callback "struct archive *" "void *client_data" +.El +.Pp +The close callback is invoked by archive_write_close when +the archive processing is complete. +The callback should return +.Cm ARCHIVE_OK +on success. +On failure, the callback should invoke +.Fn archive_set_error +to register an error code and message and +return +.Cm ARCHIVE_FATAL . +.Sh EXAMPLE +The following sketch illustrates basic usage of the library. +In this example, +the callback functions are simply wrappers around the standard +.Xr open 2 , +.Xr write 2 , +and +.Xr close 2 +system calls.
+.Bd -literal -offset indent +#ifdef __linux__ +#define _FILE_OFFSET_BITS 64 +#endif +#include <sys/stat.h> +#include <archive.h> +#include <archive_entry.h> +#include <fcntl.h> +#include <stdlib.h> +#include <unistd.h> + +struct mydata { + const char *name; + int fd; +}; + +int +myopen(struct archive *a, void *client_data) +{ + struct mydata *mydata = client_data; + + mydata->fd = open(mydata->name, O_WRONLY | O_CREAT, 0644); + if (mydata->fd >= 0) + return (ARCHIVE_OK); + else + return (ARCHIVE_FATAL); +} + +ssize_t +mywrite(struct archive *a, void *client_data, const void *buff, size_t n) +{ + struct mydata *mydata = client_data; + + return (write(mydata->fd, buff, n)); +} + +int +myclose(struct archive *a, void *client_data) +{ + struct mydata *mydata = client_data; + + if (mydata->fd >= 0) + close(mydata->fd); + return (0); +} + +void +write_archive(const char *outname, const char **filename) +{ + struct mydata *mydata = malloc(sizeof(struct mydata)); + struct archive *a; + struct archive_entry *entry; + struct stat st; + char buff[8192]; + int len; + int fd; + + a = archive_write_new(); + mydata->name = outname; + archive_write_set_compression_gzip(a); + archive_write_set_format_ustar(a); + archive_write_open(a, mydata, myopen, mywrite, myclose); + while (*filename) { + stat(*filename, &st); + entry = archive_entry_new(); + archive_entry_copy_stat(entry, &st); + archive_entry_set_pathname(entry, *filename); + archive_write_header(a, entry); + fd = open(*filename, O_RDONLY); + len = read(fd, buff, sizeof(buff)); + while ( len > 0 ) { + archive_write_data(a, buff, len); + len = read(fd, buff, sizeof(buff)); + } + close(fd); + archive_entry_free(entry); + filename++; + } + archive_write_finish(a); +} + +int main(int argc, const char **argv) +{ + const char *outname; + argv++; + outname = *argv++; + write_archive(outname, argv); + return 0; +} +.Ed +.Sh RETURN VALUES +Most functions return +.Cm ARCHIVE_OK +(zero) on success, or one of several non-zero +error codes for errors.
+Specific error codes include: +.Cm ARCHIVE_RETRY +for operations that might succeed if retried, +.Cm ARCHIVE_WARN +for unusual conditions that do not prevent further operations, and +.Cm ARCHIVE_FATAL +for serious errors that make remaining operations impossible. +The +.Fn archive_errno +and +.Fn archive_error_string +functions can be used to retrieve an appropriate error code and a +textual error message. +.Pp +.Fn archive_write_new +returns a pointer to a newly-allocated +.Tn struct archive +object. +.Pp +.Fn archive_write_data +returns a count of the number of bytes actually written. +On error, -1 is returned and the +.Fn archive_errno +and +.Fn archive_error_string +functions will return appropriate values. +Note that if the client-provided write callback function +returns a non-zero value, that error will be propagated back to the caller +through whatever API function resulted in that call, which +may include +.Fn archive_write_header , +.Fn archive_write_data , +.Fn archive_write_close , +or +.Fn archive_write_finish . +The client callback can call +.Fn archive_set_error +to provide values that can then be retrieved by +.Fn archive_errno +and +.Fn archive_error_string . +.Sh SEE ALSO +.Xr tar 1 , +.Xr libarchive 3 , +.Xr tar 5 +.Sh HISTORY +The +.Nm libarchive +library first appeared in +.Fx 5.3 . +.Sh AUTHORS +.An -nosplit +The +.Nm libarchive +library was written by +.An Tim Kientzle Aq kientzle@acm.org . +.Sh BUGS +There are many peculiar bugs in historic tar implementations that may cause +certain programs to reject archives written by this library. +For example, several historic implementations calculated header checksums +incorrectly and will thus reject valid archives; GNU tar does not fully support +pax interchange format; some old tar implementations required specific +field terminations. 
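The header checksum problem mentioned in the BUGS text above can be illustrated with a small, self-contained C sketch. It assumes the standard ustar layout: a 512-byte header whose 8-byte checksum field starts at offset 148 and is counted as spaces while summing. The second function sums the bytes as signed char, which is one commonly cited way historic implementations went wrong. Both function names are hypothetical and are not part of libarchive.

```c
#include <stddef.h>

#define USTAR_HEADER_SIZE   512
#define USTAR_CHKSUM_OFFSET 148
#define USTAR_CHKSUM_LENGTH 8

/* Sum the header bytes as unsigned values; the checksum field counts as spaces. */
long
ustar_checksum_unsigned(const unsigned char *hdr)
{
	long sum = 0;
	size_t i;

	for (i = 0; i < USTAR_HEADER_SIZE; i++) {
		if (i >= USTAR_CHKSUM_OFFSET &&
		    i < USTAR_CHKSUM_OFFSET + USTAR_CHKSUM_LENGTH)
			sum += ' ';
		else
			sum += hdr[i];
	}
	return (sum);
}

/* Same walk, but bytes are treated as signed char (illustrates the historic bug). */
long
ustar_checksum_signed(const unsigned char *hdr)
{
	long sum = 0;
	size_t i;

	for (i = 0; i < USTAR_HEADER_SIZE; i++) {
		if (i >= USTAR_CHKSUM_OFFSET &&
		    i < USTAR_CHKSUM_OFFSET + USTAR_CHKSUM_LENGTH)
			sum += ' ';
		else
			sum += (signed char)hdr[i];
	}
	return (sum);
}
```

The two results differ only when the header contains bytes with the high bit set, for example in non-ASCII file names, so a reader that wants to be tolerant could compare the stored value against both sums.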
+.Pp +The default pax interchange format eliminates most of the historic +tar limitations and provides a generic key/value attribute facility +for vendor-defined extensions. +One oversight in POSIX is the failure to provide a standard attribute +for large device numbers. +This library uses +.Dq SCHILY.devminor +and +.Dq SCHILY.devmajor +for device numbers that exceed the range supported by the backwards-compatible +ustar header. +These keys are compatible with Joerg Schilling's +.Nm star +archiver. +Other implementations may not recognize these keys and will thus be unable +to correctly restore device nodes with large device numbers from archives +created by this library. diff --git a/lib/libarchive/archive_write.c b/lib/libarchive/archive_write.c new file mode 100644 index 000000000..8ed71de59 --- /dev/null +++ b/lib/libarchive/archive_write.c @@ -0,0 +1,476 @@ +/*- + * Copyright (c) 2003-2007 Tim Kientzle + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR(S) ``AS IS'' AND ANY EXPRESS OR + * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES + * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. 
+ * IN NO EVENT SHALL THE AUTHOR(S) BE LIABLE FOR ANY DIRECT, INDIRECT, + * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT + * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF + * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + */ + +#include "archive_platform.h" +__FBSDID("$FreeBSD: head/lib/libarchive/archive_write.c 201099 2009-12-28 03:03:00Z kientzle $"); + +/* + * This file contains the "essential" portions of the write API, that + * is, stuff that will essentially always be used by any client that + * actually needs to write an archive. Optional pieces have been, as + * far as possible, separated out into separate files to reduce + * needlessly bloating statically-linked clients. + */ + +#ifdef HAVE_SYS_WAIT_H +#include <sys/wait.h> +#endif +#ifdef HAVE_LIMITS_H +#include <limits.h> +#endif +#include <stdio.h> +#ifdef HAVE_STDLIB_H +#include <stdlib.h> +#endif +#ifdef HAVE_STRING_H +#include <string.h> +#endif +#include <time.h> +#ifdef HAVE_UNISTD_H +#include <unistd.h> +#endif + +#include "archive.h" +#include "archive_entry.h" +#include "archive_private.h" +#include "archive_write_private.h" + +static struct archive_vtable *archive_write_vtable(void); + +static int _archive_write_close(struct archive *); +static int _archive_write_finish(struct archive *); +static int _archive_write_header(struct archive *, struct archive_entry *); +static int _archive_write_finish_entry(struct archive *); +static ssize_t _archive_write_data(struct archive *, const void *, size_t); + +static struct archive_vtable * +archive_write_vtable(void) +{ + static struct archive_vtable av; + static int inited = 0; + + if (!inited) { + av.archive_close = _archive_write_close; + av.archive_finish = _archive_write_finish; + av.archive_write_header = _archive_write_header;
+ av.archive_write_finish_entry = _archive_write_finish_entry; + av.archive_write_data = _archive_write_data; + } + return (&av); +} + +/* + * Allocate, initialize and return an archive object. + */ +struct archive * +archive_write_new(void) +{ + struct archive_write *a; + unsigned char *nulls; + + a = (struct archive_write *)malloc(sizeof(*a)); + if (a == NULL) + return (NULL); + memset(a, 0, sizeof(*a)); + a->archive.magic = ARCHIVE_WRITE_MAGIC; + a->archive.state = ARCHIVE_STATE_NEW; + a->archive.vtable = archive_write_vtable(); + /* + * The value 10240 here matches the traditional tar default, + * but is otherwise arbitrary. + * TODO: Set the default block size from the format selected. + */ + a->bytes_per_block = 10240; + a->bytes_in_last_block = -1; /* Default */ + + /* Initialize a block of nulls for padding purposes. */ + a->null_length = 1024; + nulls = (unsigned char *)malloc(a->null_length); + if (nulls == NULL) { + free(a); + return (NULL); + } + memset(nulls, 0, a->null_length); + a->nulls = nulls; + /* + * Set default compression, but don't set a default format. + * Were we to set a default format here, we would force every + * client to link in support for that format, even if they didn't + * ever use it. + */ + archive_write_set_compression_none(&a->archive); + return (&a->archive); +} + +/* + * Set write options for the format. Returns 0 if successful. + */ +int +archive_write_set_format_options(struct archive *_a, const char *s) +{ + struct archive_write *a = (struct archive_write *)_a; + char key[64], val[64]; + int len, r, ret = ARCHIVE_OK; + + __archive_check_magic(&a->archive, ARCHIVE_WRITE_MAGIC, + ARCHIVE_STATE_NEW, "archive_write_set_format_options"); + archive_clear_error(&a->archive); + + if (s == NULL || *s == '\0') + return (ARCHIVE_OK); + if (a->format_options == NULL) + /* This format does not support option. 
*/ + return (ARCHIVE_OK); + + while ((len = __archive_parse_options(s, a->format_name, + sizeof(key), key, sizeof(val), val)) > 0) { + if (val[0] == '\0') + r = a->format_options(a, key, NULL); + else + r = a->format_options(a, key, val); + if (r == ARCHIVE_FATAL) + return (r); + if (r < ARCHIVE_OK) { /* This key was not handled. */ + archive_set_error(&a->archive, ARCHIVE_ERRNO_MISC, + "Unsupported option ``%s''", key); + ret = ARCHIVE_WARN; + } + s += len; + } + if (len < 0) { + archive_set_error(&a->archive, ARCHIVE_ERRNO_MISC, + "Malformed options string."); + return (ARCHIVE_WARN); + } + return (ret); +} + +/* + * Set write options for the compressor. Returns 0 if successful. + */ +int +archive_write_set_compressor_options(struct archive *_a, const char *s) +{ + struct archive_write *a = (struct archive_write *)_a; + char key[64], val[64]; + int len, r; + int ret = ARCHIVE_OK; + + __archive_check_magic(&a->archive, ARCHIVE_WRITE_MAGIC, + ARCHIVE_STATE_NEW, "archive_write_set_compressor_options"); + archive_clear_error(&a->archive); + + if (s == NULL || *s == '\0') + return (ARCHIVE_OK); + if (a->compressor.options == NULL) { + archive_set_error(&a->archive, ARCHIVE_ERRNO_MISC, + "Unsupported option ``%s''", s); + /* This compressor does not support option. */ + return (ARCHIVE_WARN); + } + + while ((len = __archive_parse_options(s, a->archive.compression_name, + sizeof(key), key, sizeof(val), val)) > 0) { + if (val[0] == '\0') + r = a->compressor.options(a, key, NULL); + else + r = a->compressor.options(a, key, val); + if (r == ARCHIVE_FATAL) + return (r); + if (r < ARCHIVE_OK) { + archive_set_error(&a->archive, ARCHIVE_ERRNO_MISC, + "Unsupported option ``%s''", key); + ret = ARCHIVE_WARN; + } + s += len; + } + if (len < 0) { + archive_set_error(&a->archive, ARCHIVE_ERRNO_MISC, + "Illegal format options."); + return (ARCHIVE_WARN); + } + return (ret); +} + +/* + * Set write options for the format and the compressor. Returns 0 if successful. 
+ */ +int +archive_write_set_options(struct archive *_a, const char *s) +{ + int r1, r2; + + r1 = archive_write_set_format_options(_a, s); + if (r1 < ARCHIVE_WARN) + return (r1); + r2 = archive_write_set_compressor_options(_a, s); + if (r2 < ARCHIVE_WARN) + return (r2); + if (r1 == ARCHIVE_WARN && r2 == ARCHIVE_WARN) + return (ARCHIVE_WARN); + return (ARCHIVE_OK); +} + +/* + * Set the block size. Returns 0 if successful. + */ +int +archive_write_set_bytes_per_block(struct archive *_a, int bytes_per_block) +{ + struct archive_write *a = (struct archive_write *)_a; + __archive_check_magic(&a->archive, ARCHIVE_WRITE_MAGIC, + ARCHIVE_STATE_NEW, "archive_write_set_bytes_per_block"); + a->bytes_per_block = bytes_per_block; + return (ARCHIVE_OK); +} + +/* + * Get the current block size. -1 if it has never been set. + */ +int +archive_write_get_bytes_per_block(struct archive *_a) +{ + struct archive_write *a = (struct archive_write *)_a; + __archive_check_magic(&a->archive, ARCHIVE_WRITE_MAGIC, + ARCHIVE_STATE_ANY, "archive_write_get_bytes_per_block"); + return (a->bytes_per_block); +} + +/* + * Set the size for the last block. + * Returns 0 if successful. + */ +int +archive_write_set_bytes_in_last_block(struct archive *_a, int bytes) +{ + struct archive_write *a = (struct archive_write *)_a; + __archive_check_magic(&a->archive, ARCHIVE_WRITE_MAGIC, + ARCHIVE_STATE_ANY, "archive_write_set_bytes_in_last_block"); + a->bytes_in_last_block = bytes; + return (ARCHIVE_OK); +} + +/* + * Return the value set above. -1 indicates it has not been set. + */ +int +archive_write_get_bytes_in_last_block(struct archive *_a) +{ + struct archive_write *a = (struct archive_write *)_a; + __archive_check_magic(&a->archive, ARCHIVE_WRITE_MAGIC, + ARCHIVE_STATE_ANY, "archive_write_get_bytes_in_last_block"); + return (a->bytes_in_last_block); +} + + +/* + * dev/ino of a file to be rejected. Used to prevent adding + * an archive to itself recursively. 
+ */ +int +archive_write_set_skip_file(struct archive *_a, dev_t d, ino_t i) +{ + struct archive_write *a = (struct archive_write *)_a; + __archive_check_magic(&a->archive, ARCHIVE_WRITE_MAGIC, + ARCHIVE_STATE_ANY, "archive_write_set_skip_file"); + a->skip_file_dev = d; + a->skip_file_ino = i; + return (ARCHIVE_OK); +} + + +/* + * Open the archive using the current settings. + */ +int +archive_write_open(struct archive *_a, void *client_data, + archive_open_callback *opener, archive_write_callback *writer, + archive_close_callback *closer) +{ + struct archive_write *a = (struct archive_write *)_a; + int ret; + + __archive_check_magic(&a->archive, ARCHIVE_WRITE_MAGIC, + ARCHIVE_STATE_NEW, "archive_write_open"); + archive_clear_error(&a->archive); + a->archive.state = ARCHIVE_STATE_HEADER; + a->client_data = client_data; + a->client_writer = writer; + a->client_opener = opener; + a->client_closer = closer; + ret = (a->compressor.init)(a); + if (a->format_init && ret == ARCHIVE_OK) + ret = (a->format_init)(a); + return (ret); +} + + +/* + * Close out the archive. + * + * Be careful: user might just call write_new and then write_finish. + * Don't assume we actually wrote anything or performed any non-trivial + * initialization. + */ +static int +_archive_write_close(struct archive *_a) +{ + struct archive_write *a = (struct archive_write *)_a; + int r = ARCHIVE_OK, r1 = ARCHIVE_OK; + + __archive_check_magic(&a->archive, ARCHIVE_WRITE_MAGIC, + ARCHIVE_STATE_ANY, "archive_write_close"); + + /* Finish the last entry. */ + if (a->archive.state & ARCHIVE_STATE_DATA) + r = ((a->format_finish_entry)(a)); + + /* Finish off the archive. */ + if (a->format_finish != NULL) { + r1 = (a->format_finish)(a); + if (r1 < r) + r = r1; + } + + /* Release format resources. */ + if (a->format_destroy != NULL) { + r1 = (a->format_destroy)(a); + if (r1 < r) + r = r1; + } + + /* Finish the compression and close the stream. 
*/ + if (a->compressor.finish != NULL) { + r1 = (a->compressor.finish)(a); + if (r1 < r) + r = r1; + } + + /* Close out the client stream. */ + if (a->client_closer != NULL) { + r1 = (a->client_closer)(&a->archive, a->client_data); + if (r1 < r) + r = r1; + } + + a->archive.state = ARCHIVE_STATE_CLOSED; + return (r); +} + +/* + * Destroy the archive structure. + */ +static int +_archive_write_finish(struct archive *_a) +{ + struct archive_write *a = (struct archive_write *)_a; + int r = ARCHIVE_OK; + + __archive_check_magic(&a->archive, ARCHIVE_WRITE_MAGIC, + ARCHIVE_STATE_ANY, "archive_write_finish"); + if (a->archive.state != ARCHIVE_STATE_CLOSED) + r = archive_write_close(&a->archive); + + /* Release various dynamic buffers. */ + free((void *)(uintptr_t)(const void *)a->nulls); + archive_string_free(&a->archive.error_string); + a->archive.magic = 0; + free(a); + return (r); +} + +/* + * Write the appropriate header. + */ +static int +_archive_write_header(struct archive *_a, struct archive_entry *entry) +{ + struct archive_write *a = (struct archive_write *)_a; + int ret, r2; + + __archive_check_magic(&a->archive, ARCHIVE_WRITE_MAGIC, + ARCHIVE_STATE_DATA | ARCHIVE_STATE_HEADER, "archive_write_header"); + archive_clear_error(&a->archive); + + /* In particular, "retry" and "fatal" get returned immediately. 
*/ + ret = archive_write_finish_entry(&a->archive); + if (ret < ARCHIVE_OK && ret != ARCHIVE_WARN) + return (ret); + +#ifndef __minix + if (a->skip_file_dev != 0 && + archive_entry_dev(entry) == a->skip_file_dev && + a->skip_file_ino != 0 && + archive_entry_ino64(entry) == a->skip_file_ino) { + archive_set_error(&a->archive, 0, + "Can't add archive to itself"); + return (ARCHIVE_FAILED); + } +#else + if (a->skip_file_dev != 0 && + archive_entry_dev(entry) == a->skip_file_dev && + a->skip_file_ino != 0 && + archive_entry_ino(entry) == a->skip_file_ino) { + archive_set_error(&a->archive, 0, + "Can't add archive to itself"); + return (ARCHIVE_FAILED); + } +#endif + /* Format and write header. */ + r2 = ((a->format_write_header)(a, entry)); + if (r2 < ret) + ret = r2; + + a->archive.state = ARCHIVE_STATE_DATA; + return (ret); +} + +static int +_archive_write_finish_entry(struct archive *_a) +{ + struct archive_write *a = (struct archive_write *)_a; + int ret = ARCHIVE_OK; + + __archive_check_magic(&a->archive, ARCHIVE_WRITE_MAGIC, + ARCHIVE_STATE_HEADER | ARCHIVE_STATE_DATA, + "archive_write_finish_entry"); + if (a->archive.state & ARCHIVE_STATE_DATA) + ret = (a->format_finish_entry)(a); + a->archive.state = ARCHIVE_STATE_HEADER; + return (ret); +} + +/* + * Note that the compressor is responsible for blocking. + */ +static ssize_t +_archive_write_data(struct archive *_a, const void *buff, size_t s) +{ + struct archive_write *a = (struct archive_write *)_a; + __archive_check_magic(&a->archive, ARCHIVE_WRITE_MAGIC, + ARCHIVE_STATE_DATA, "archive_write_data"); + archive_clear_error(&a->archive); + return ((a->format_write_data)(a, buff, s)); +} diff --git a/lib/libarchive/archive_write_disk.3 b/lib/libarchive/archive_write_disk.3 new file mode 100644 index 000000000..5ed4a5038 --- /dev/null +++ b/lib/libarchive/archive_write_disk.3 @@ -0,0 +1,375 @@ +.\" Copyright (c) 2003-2007 Tim Kientzle +.\" All rights reserved. 
+.\" +.\" Redistribution and use in source and binary forms, with or without +.\" modification, are permitted provided that the following conditions +.\" are met: +.\" 1. Redistributions of source code must retain the above copyright +.\" notice, this list of conditions and the following disclaimer. +.\" 2. Redistributions in binary form must reproduce the above copyright +.\" notice, this list of conditions and the following disclaimer in the +.\" documentation and/or other materials provided with the distribution. +.\" +.\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND +.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +.\" ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE +.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS +.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) +.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT +.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY +.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF +.\" SUCH DAMAGE. 
+.\" +.\" $FreeBSD: src/lib/libarchive/archive_write_disk.3,v 1.4 2008/09/04 05:22:00 kientzle Exp $ +.\" +.Dd August 5, 2008 +.Dt archive_write_disk 3 +.Os +.Sh NAME +.Nm archive_write_disk_new , +.Nm archive_write_disk_set_options , +.Nm archive_write_disk_set_skip_file , +.Nm archive_write_disk_set_group_lookup , +.Nm archive_write_disk_set_standard_lookup , +.Nm archive_write_disk_set_user_lookup , +.Nm archive_write_header , +.Nm archive_write_data , +.Nm archive_write_finish_entry , +.Nm archive_write_close , +.Nm archive_write_finish +.Nd functions for creating objects on disk +.Sh SYNOPSIS +.In archive.h +.Ft struct archive * +.Fn archive_write_disk_new "void" +.Ft int +.Fn archive_write_disk_set_options "struct archive *" "int flags" +.Ft int +.Fn archive_write_disk_set_skip_file "struct archive *" "dev_t" "ino_t" +.Ft int +.Fo archive_write_disk_set_group_lookup +.Fa "struct archive *" +.Fa "void *" +.Fa "gid_t (*)(void *, const char *gname, gid_t gid)" +.Fa "void (*cleanup)(void *)" +.Fc +.Ft int +.Fn archive_write_disk_set_standard_lookup "struct archive *" +.Ft int +.Fo archive_write_disk_set_user_lookup +.Fa "struct archive *" +.Fa "void *" +.Fa "uid_t (*)(void *, const char *uname, uid_t uid)" +.Fa "void (*cleanup)(void *)" +.Fc +.Ft int +.Fn archive_write_header "struct archive *" "struct archive_entry *" +.Ft ssize_t +.Fn archive_write_data "struct archive *" "const void *" "size_t" +.Ft int +.Fn archive_write_finish_entry "struct archive *" +.Ft int +.Fn archive_write_close "struct archive *" +.Ft int +.Fn archive_write_finish "struct archive *" +.Sh DESCRIPTION +These functions provide a complete API for creating objects on +disk from +.Tn struct archive_entry +descriptions. +They are most naturally used when extracting objects from an archive +using the +.Fn archive_read +interface. 
+The general process is to read +.Tn struct archive_entry +objects from an archive, then write those objects to a +.Tn struct archive +object created using the +.Fn archive_write_disk +family functions. +This interface is deliberately very similar to the +.Fn archive_write +interface used to write objects to a streaming archive. +.Bl -tag -width indent +.It Fn archive_write_disk_new +Allocates and initializes a +.Tn struct archive +object suitable for writing objects to disk. +.It Fn archive_write_disk_set_skip_file +Records the device and inode numbers of a file that should not be +overwritten. +This is typically used to ensure that an extraction process does not +overwrite the archive from which objects are being read. +This capability is technically unnecessary but can be a significant +performance optimization in practice. +.It Fn archive_write_disk_set_options +The options field consists of a bitwise OR of one or more of the +following values: +.Bl -tag -compact -width "indent" +.It Cm ARCHIVE_EXTRACT_OWNER +The user and group IDs should be set on the restored file. +By default, the user and group IDs are not restored. +.It Cm ARCHIVE_EXTRACT_PERM +Full permissions (including SGID, SUID, and sticky bits) should +be restored exactly as specified, without obeying the +current umask. +Note that SUID and SGID bits can only be restored if the +user and group ID of the object on disk are correct. +If +.Cm ARCHIVE_EXTRACT_OWNER +is not specified, then SUID and SGID bits will only be restored +if the default user and group IDs of newly-created objects on disk +happen to match those specified in the archive entry. +By default, only basic permissions are restored, and umask is obeyed. +.It Cm ARCHIVE_EXTRACT_TIME +The timestamps (mtime, ctime, and atime) should be restored. +By default, they are ignored. +Note that restoring of atime is not currently supported. +.It Cm ARCHIVE_EXTRACT_NO_OVERWRITE +Existing files on disk will not be overwritten. 
+By default, existing regular files are truncated and overwritten; +existing directories will have their permissions updated; +other pre-existing objects are unlinked and recreated from scratch. +.It Cm ARCHIVE_EXTRACT_UNLINK +Existing files on disk will be unlinked before any attempt to +create them. +In some cases, this can prove to be a significant performance improvement. +By default, existing files are truncated and rewritten, but +the file is not recreated. +In particular, the default behavior does not break existing hard links. +.It Cm ARCHIVE_EXTRACT_ACL +Attempt to restore ACLs. +By default, extended ACLs are ignored. +.It Cm ARCHIVE_EXTRACT_FFLAGS +Attempt to restore extended file flags. +By default, file flags are ignored. +.It Cm ARCHIVE_EXTRACT_XATTR +Attempt to restore POSIX.1e extended attributes. +By default, they are ignored. +.It Cm ARCHIVE_EXTRACT_SECURE_SYMLINKS +Refuse to extract any object whose final location would be altered +by a symlink on disk. +This is intended to help guard against a variety of mischief +caused by archives that (deliberately or otherwise) extract +files outside of the current directory. +The default is not to perform this check. +If +.Cm ARCHIVE_EXTRACT_UNLINK +is specified together with this option, the library will +remove any intermediate symlinks it finds and return an +error only if such symlink could not be removed. +.It Cm ARCHIVE_EXTRACT_SECURE_NODOTDOT +Refuse to extract a path that contains a +.Pa .. +element anywhere within it. +The default is to not refuse such paths. +Note that paths ending in +.Pa .. +always cause an error, regardless of this flag. +.It Cm ARCHIVE_EXTRACT_SPARSE +Scan data for blocks of NUL bytes and try to recreate them with holes. +This results in sparse files, independent of whether the archive format +supports or uses them. 
+.El +.It Xo +.Fn archive_write_disk_set_group_lookup , +.Fn archive_write_disk_set_user_lookup +.Xc +The +.Tn struct archive_entry +objects contain both names and ids that can be used to identify users +and groups. +These names and ids describe the ownership of the file itself and +also appear in ACL lists. +By default, the library uses the ids and ignores the names, but +this can be overridden by registering user and group lookup functions. +To register, you must provide a lookup function which +accepts both a name and id and returns a suitable id. +You may also provide a +.Tn void * +pointer to a private data structure and a cleanup function for +that data. +The cleanup function will be invoked when the +.Tn struct archive +object is destroyed. +.It Fn archive_write_disk_set_standard_lookup +This convenience function installs a standard set of user +and group lookup functions. +These functions use +.Xr getpwnam 3 +and +.Xr getgrnam 3 +to convert names to ids, defaulting to the ids if the names cannot +be looked up. +These functions also implement a simple memory cache to reduce +the number of calls to +.Xr getpwnam 3 +and +.Xr getgrnam 3 . +.It Fn archive_write_header +Build and write a header using the data in the provided +.Tn struct archive_entry +structure. +See +.Xr archive_entry 3 +for information on creating and populating +.Tn struct archive_entry +objects. +.It Fn archive_write_data +Write data corresponding to the header just written. +Returns number of bytes written or -1 on error. +.It Fn archive_write_finish_entry +Close out the entry just written. +Ordinarily, clients never need to call this, as it +is called automatically by +.Fn archive_write_header +and +.Fn archive_write_close +as needed. +.It Fn archive_write_close +Set any attributes that could not be set during the initial restore. +For example, directory timestamps are not restored initially because +restoring a subsequent file would alter that timestamp.
+Similarly, non-writable directories are initially created with +write permissions (so that their contents can be restored). +The +.Nm +library maintains a list of all such deferred attributes and +sets them when this function is invoked. +.It Fn archive_write_finish +Invokes +.Fn archive_write_close +if it was not invoked manually, then releases all resources. +.El +More information about the +.Va struct archive +object and the overall design of the library can be found in the +.Xr libarchive 3 +overview. +Many of these functions are also documented under +.Xr archive_write 3 . +.Sh RETURN VALUES +Most functions return +.Cm ARCHIVE_OK +(zero) on success, or one of several non-zero +error codes for errors. +Specific error codes include: +.Cm ARCHIVE_RETRY +for operations that might succeed if retried, +.Cm ARCHIVE_WARN +for unusual conditions that do not prevent further operations, and +.Cm ARCHIVE_FATAL +for serious errors that make remaining operations impossible. +The +.Fn archive_errno +and +.Fn archive_error_string +functions can be used to retrieve an appropriate error code and a +textual error message. +.Pp +.Fn archive_write_disk_new +returns a pointer to a newly-allocated +.Tn struct archive +object. +.Pp +.Fn archive_write_data +returns a count of the number of bytes actually written. +On error, -1 is returned and the +.Fn archive_errno +and +.Fn archive_error_string +functions will return appropriate values. +.Sh SEE ALSO +.Xr archive_read 3 , +.Xr archive_write 3 , +.Xr tar 1 , +.Xr libarchive 3 +.Sh HISTORY +The +.Nm libarchive +library first appeared in +.Fx 5.3 . +The +.Nm archive_write_disk +interface was added to +.Nm libarchive 2.0 +and first appeared in +.Fx 6.3 . +.Sh AUTHORS +.An -nosplit +The +.Nm libarchive +library was written by +.An Tim Kientzle Aq kientzle@acm.org . +.Sh BUGS +Directories are actually extracted in two distinct phases. 
+Directories are created during +.Fn archive_write_header , +but final permissions are not set until +.Fn archive_write_close . +This separation is necessary to correctly handle borderline +cases such as a non-writable directory containing +files, but can cause unexpected results. +In particular, directory permissions are not fully +restored until the archive is closed. +If you use +.Xr chdir 2 +to change the current directory between calls to +.Fn archive_read_extract +or before calling +.Fn archive_read_close , +you may confuse the permission-setting logic with +the result that directory permissions are restored +incorrectly. +.Pp +The library attempts to create objects with filenames longer than +.Cm PATH_MAX +by creating prefixes of the full path and changing the current directory. +Currently, this logic is limited in scope; the fixup pass does +not work correctly for such objects and the symlink security check +option disables the support for very long pathnames. +.Pp +Restoring the path +.Pa aa/../bb +does create each intermediate directory. +In particular, the directory +.Pa aa +is created as well as the final object +.Pa bb . +In theory, this can be exploited to create an entire directory hierarchy +with a single request. +Of course, this does not work if the +.Cm ARCHIVE_EXTRACT_SECURE_NODOTDOT +option is specified. +.Pp +Implicit directories are always created obeying the current umask. +Explicit objects are created obeying the current umask unless +.Cm ARCHIVE_EXTRACT_PERM +is specified, in which case the current umask is ignored. +.Pp +SGID and SUID bits are restored only if the correct user and +group could be set. +If +.Cm ARCHIVE_EXTRACT_OWNER +is not specified, then no attempt is made to set the ownership. +In this case, SGID and SUID bits are restored only if the +user and group of the final object happen to match those specified +in the entry.
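As a rough model of the umask and SUID/SGID rules described above, the following C sketch computes the mode that would end up on a restored object. It is a simplification for illustration only and is not the library's code: sketch_effective_mode and all of its parameters are hypothetical, and it assumes that the SUID bit depends on the user matching and the SGID bit on the group matching.

```c
/*
 * Hypothetical helper, not part of libarchive.
 *
 * entry_mode:   mode bits recorded in the archive entry
 * cur_umask:    the process umask at extraction time
 * extract_perm: nonzero if ARCHIVE_EXTRACT_PERM was requested
 * uid_matches:  nonzero if the on-disk owner equals the entry's owner
 * gid_matches:  nonzero if the on-disk group equals the entry's group
 */
unsigned
sketch_effective_mode(unsigned entry_mode, unsigned cur_umask,
    int extract_perm, int uid_matches, int gid_matches)
{
	unsigned mode;

	/* Default: only basic permissions are restored and umask is obeyed. */
	if (!extract_perm)
		return (entry_mode & 0777u & ~cur_umask);

	/* With full permissions requested, the umask is ignored. */
	mode = entry_mode & 07777u;

	/* Assumption: drop SUID/SGID when the matching id is not correct. */
	if (!uid_matches)
		mode &= ~04000u;
	if (!gid_matches)
		mode &= ~02000u;
	return (mode);
}
```

In this model, an entry recorded as 04755 comes out as 0755 under a umask of 022 when full permissions are not requested, and keeps its SUID bit only when full permissions are requested and the owner is correct.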
+.Pp +The +.Dq standard +user-id and group-id lookup functions are not the defaults because +.Xr getgrnam 3 +and +.Xr getpwnam 3 +are sometimes too large for particular applications. +The current design allows the application author to use a more +compact implementation when appropriate. +.Pp +There should be a corresponding +.Nm archive_read_disk +interface that walks a directory hierarchy and returns archive +entry objects. \ No newline at end of file diff --git a/lib/libarchive/archive_write_disk.c b/lib/libarchive/archive_write_disk.c new file mode 100644 index 000000000..20d7b7452 --- /dev/null +++ b/lib/libarchive/archive_write_disk.c @@ -0,0 +1,2638 @@ +/*- + * Copyright (c) 2003-2007 Tim Kientzle + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer + * in this position and unchanged. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR(S) ``AS IS'' AND ANY EXPRESS OR + * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES + * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
+ * IN NO EVENT SHALL THE AUTHOR(S) BE LIABLE FOR ANY DIRECT, INDIRECT, + * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT + * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF + * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + */ + +#include "archive_platform.h" +__FBSDID("$FreeBSD: head/lib/libarchive/archive_write_disk.c 201159 2009-12-29 05:35:40Z kientzle $"); + +#ifdef HAVE_SYS_TYPES_H +#include <sys/types.h> +#endif +#ifdef HAVE_SYS_ACL_H +#include <sys/acl.h> +#endif +#ifdef HAVE_SYS_EXTATTR_H +#include <sys/extattr.h> +#endif +#ifdef HAVE_SYS_XATTR_H +#include <sys/xattr.h> +#endif +#ifdef HAVE_ATTR_XATTR_H +#include <attr/xattr.h> +#endif +#ifdef HAVE_SYS_IOCTL_H +#include <sys/ioctl.h> +#endif +#ifdef HAVE_SYS_STAT_H +#include <sys/stat.h> +#endif +#ifdef HAVE_SYS_TIME_H +#include <sys/time.h> +#endif +#ifdef HAVE_SYS_UTIME_H +#include <sys/utime.h> +#endif +#ifdef HAVE_ERRNO_H +#include <errno.h> +#endif +#ifdef HAVE_FCNTL_H +#include <fcntl.h> +#endif +#ifdef HAVE_GRP_H +#include <grp.h> +#endif +#ifdef HAVE_LINUX_FS_H +#include <linux/fs.h> /* for Linux file flags */ +#endif +/* + * Some Linux distributions have both linux/ext2_fs.h and ext2fs/ext2_fs.h. + * As the include guards don't agree, the order of include is important.
+ */ +#ifdef HAVE_LINUX_EXT2_FS_H +#include <linux/ext2_fs.h> /* for Linux file flags */ +#endif +#if defined(HAVE_EXT2FS_EXT2_FS_H) && !defined(__CYGWIN__) +#include <ext2fs/ext2_fs.h> /* Linux file flags, broken on Cygwin */ +#endif +#ifdef HAVE_LIMITS_H +#include <limits.h> +#endif +#ifdef HAVE_PWD_H +#include <pwd.h> +#endif +#include <stdio.h> +#ifdef HAVE_STDLIB_H +#include <stdlib.h> +#endif +#ifdef HAVE_STRING_H +#include <string.h> +#endif +#ifdef HAVE_UNISTD_H +#include <unistd.h> +#endif +#ifdef HAVE_UTIME_H +#include <utime.h> +#endif + +#include "archive.h" +#include "archive_string.h" +#include "archive_entry.h" +#include "archive_private.h" + +#ifndef O_BINARY +#define O_BINARY 0 +#endif + +struct fixup_entry { + struct fixup_entry *next; + mode_t mode; +#ifndef __minix + int64_t atime; + int64_t birthtime; + int64_t mtime; +#else + time_t atime; + time_t birthtime; + time_t mtime; +#endif + unsigned long atime_nanos; + unsigned long birthtime_nanos; + unsigned long mtime_nanos; + unsigned long fflags_set; + int fixup; /* bitmask of what needs fixing */ + char *name; +}; + +/* + * We use a bitmask to track which operations remain to be done for + * this file. In particular, this helps us avoid unnecessary + * operations when it's possible to take care of one step as a + * side-effect of another. For example, mkdir() can specify the mode + * for the newly-created object but symlink() cannot. This means we + * can skip chmod() if mkdir() succeeded, but we must explicitly + * chmod() if we're trying to create a directory that already exists + * (mkdir() failed) or if we're restoring a symlink. Similarly, we + * need to verify UID/GID before trying to restore SUID/SGID bits; + * that verification can occur explicitly through a stat() call or + * implicitly because of a successful chown() call.
+ */ +#define TODO_MODE_FORCE 0x40000000 +#define TODO_MODE_BASE 0x20000000 +#define TODO_SUID 0x10000000 +#define TODO_SUID_CHECK 0x08000000 +#define TODO_SGID 0x04000000 +#define TODO_SGID_CHECK 0x02000000 +#define TODO_MODE (TODO_MODE_BASE|TODO_SUID|TODO_SGID) +#define TODO_TIMES ARCHIVE_EXTRACT_TIME +#define TODO_OWNER ARCHIVE_EXTRACT_OWNER +#define TODO_FFLAGS ARCHIVE_EXTRACT_FFLAGS +#define TODO_ACLS ARCHIVE_EXTRACT_ACL +#define TODO_XATTR ARCHIVE_EXTRACT_XATTR + +struct archive_write_disk { + struct archive archive; + + mode_t user_umask; + struct fixup_entry *fixup_list; + struct fixup_entry *current_fixup; + uid_t user_uid; + dev_t skip_file_dev; + ino_t skip_file_ino; + time_t start_time; + + gid_t (*lookup_gid)(void *private, const char *gname, gid_t gid); + void (*cleanup_gid)(void *private); + void *lookup_gid_data; + uid_t (*lookup_uid)(void *private, const char *uname, uid_t uid); + void (*cleanup_uid)(void *private); + void *lookup_uid_data; + + /* + * Full path of last file to satisfy symlink checks. + */ + struct archive_string path_safe; + + /* + * Cached stat data from disk for the current entry. + * If this is valid, pst points to st. Otherwise, + * pst is null. + */ + struct stat st; + struct stat *pst; + + /* Information about the object being restored right now. */ + struct archive_entry *entry; /* Entry being extracted. */ + char *name; /* Name of entry, possibly edited. */ + struct archive_string _name_data; /* backing store for 'name' */ + /* Tasks remaining for this object. */ + int todo; + /* Tasks deferred until end-of-archive. */ + int deferred; + /* Options requested by the client. */ + int flags; + /* Handle for the file we're restoring. */ + int fd; + /* Current offset for writing data to the file. */ + off_t offset; + /* Last offset actually written to disk. */ + off_t fd_offset; + /* Maximum size of file, -1 if unknown. */ + off_t filesize; + /* Dir we were in before this restore; only for deep paths. 
*/ + int restore_pwd; + /* Mode we should use for this entry; affected by _PERM and umask. */ + mode_t mode; + /* UID/GID to use in restoring this entry. */ + uid_t uid; + gid_t gid; +}; + +/* + * Default mode for dirs created automatically (will be modified by umask). + * Note that POSIX specifies 0777 for implicitly-created dirs, "modified + * by the process' file creation mask." + */ +#define DEFAULT_DIR_MODE 0777 +/* + * Dir modes are restored in two steps: During the extraction, the permissions + * in the archive are modified to match the following limits. During + * the post-extract fixup pass, the permissions from the archive are + * applied. + */ +#define MINIMUM_DIR_MODE 0700 +#define MAXIMUM_DIR_MODE 0775 + +static int check_symlinks(struct archive_write_disk *); +static int create_filesystem_object(struct archive_write_disk *); +static struct fixup_entry *current_fixup(struct archive_write_disk *, const char *pathname); +#ifdef HAVE_FCHDIR +static void edit_deep_directories(struct archive_write_disk *ad); +#endif +static int cleanup_pathname(struct archive_write_disk *); +static int create_dir(struct archive_write_disk *, char *); +static int create_parent_dir(struct archive_write_disk *, char *); +static int older(struct stat *, struct archive_entry *); +static int restore_entry(struct archive_write_disk *); +#ifdef HAVE_POSIX_ACL +static int set_acl(struct archive_write_disk *, int fd, struct archive_entry *, + acl_type_t, int archive_entry_acl_type, const char *tn); +#endif +static int set_acls(struct archive_write_disk *); +static int set_xattrs(struct archive_write_disk *); +static int set_fflags(struct archive_write_disk *); +static int set_fflags_platform(struct archive_write_disk *, int fd, + const char *name, mode_t mode, + unsigned long fflags_set, unsigned long fflags_clear); +static int set_ownership(struct archive_write_disk *); +static int set_mode(struct archive_write_disk *, int mode); +static int set_time(int, int, const char *, time_t,
long, time_t, long); +static int set_times(struct archive_write_disk *); +static struct fixup_entry *sort_dir_list(struct fixup_entry *p); +static gid_t trivial_lookup_gid(void *, const char *, gid_t); +static uid_t trivial_lookup_uid(void *, const char *, uid_t); +static ssize_t write_data_block(struct archive_write_disk *, + const char *, size_t); + +static struct archive_vtable *archive_write_disk_vtable(void); + +static int _archive_write_close(struct archive *); +static int _archive_write_finish(struct archive *); +static int _archive_write_header(struct archive *, struct archive_entry *); +static int _archive_write_finish_entry(struct archive *); +static ssize_t _archive_write_data(struct archive *, const void *, size_t); +static ssize_t _archive_write_data_block(struct archive *, const void *, size_t, off_t); + +static int +_archive_write_disk_lazy_stat(struct archive_write_disk *a) +{ + if (a->pst != NULL) { + /* Already have stat() data available. */ + return (ARCHIVE_OK); + } +#ifdef HAVE_FSTAT + if (a->fd >= 0 && fstat(a->fd, &a->st) == 0) { + a->pst = &a->st; + return (ARCHIVE_OK); + } +#endif + /* + * XXX At this point, symlinks should not be hit, otherwise + * XXX a race occurred. Do we want to check explicitly for that?
+ */ + if (lstat(a->name, &a->st) == 0) { + a->pst = &a->st; + return (ARCHIVE_OK); + } + archive_set_error(&a->archive, errno, "Couldn't stat file"); + return (ARCHIVE_WARN); +} + +static struct archive_vtable * +archive_write_disk_vtable(void) +{ + static struct archive_vtable av; + static int inited = 0; + + if (!inited) { + av.archive_close = _archive_write_close; + av.archive_finish = _archive_write_finish; + av.archive_write_header = _archive_write_header; + av.archive_write_finish_entry = _archive_write_finish_entry; + av.archive_write_data = _archive_write_data; + av.archive_write_data_block = _archive_write_data_block; + } + return (&av); +} + + +int +archive_write_disk_set_options(struct archive *_a, int flags) +{ + struct archive_write_disk *a = (struct archive_write_disk *)_a; + + a->flags = flags; + return (ARCHIVE_OK); +} + + +/* + * Extract this entry to disk. + * + * TODO: Validate hardlinks. According to the standards, we're + * supposed to check each extracted hardlink and squawk if it refers + * to a file that we didn't restore. I'm not entirely convinced this + * is a good idea, but more importantly: Is there any way to validate + * hardlinks without keeping a complete list of filenames from the + * entire archive?? Ugh. + * + */ +static int +_archive_write_header(struct archive *_a, struct archive_entry *entry) +{ + struct archive_write_disk *a = (struct archive_write_disk *)_a; + struct fixup_entry *fe; + int ret, r; + + __archive_check_magic(&a->archive, ARCHIVE_WRITE_DISK_MAGIC, + ARCHIVE_STATE_HEADER | ARCHIVE_STATE_DATA, + "archive_write_disk_header"); + archive_clear_error(&a->archive); + if (a->archive.state & ARCHIVE_STATE_DATA) { + r = _archive_write_finish_entry(&a->archive); + if (r == ARCHIVE_FATAL) + return (r); + } + + /* Set up for this particular entry. 
*/ + a->pst = NULL; + a->current_fixup = NULL; + a->deferred = 0; + if (a->entry) { + archive_entry_free(a->entry); + a->entry = NULL; + } + a->entry = archive_entry_clone(entry); + a->fd = -1; + a->fd_offset = 0; + a->offset = 0; + a->uid = a->user_uid; + a->mode = archive_entry_mode(a->entry); + if (archive_entry_size_is_set(a->entry)) + a->filesize = archive_entry_size(a->entry); + else + a->filesize = -1; + archive_strcpy(&(a->_name_data), archive_entry_pathname(a->entry)); + a->name = a->_name_data.s; + archive_clear_error(&a->archive); + + /* + * Clean up the requested path. This is necessary for correct + * dir restores; the dir restore logic otherwise gets messed + * up by nonsense like "dir/.". + */ + ret = cleanup_pathname(a); + if (ret != ARCHIVE_OK) + return (ret); + + /* + * Set the umask to zero so we get predictable mode settings. + * This gets done on every call to _write_header in case the + * user edits their umask during the extraction for some + * reason. This will be reset before we return. Note that we + * don't need to do this in _finish_entry, as the chmod(), etc, + * system calls don't obey umask. + */ + a->user_umask = umask(0); + /* From here on, early exit requires "goto done" to clean up. */ + + /* Figure out what we need to do for this entry. */ + a->todo = TODO_MODE_BASE; + if (a->flags & ARCHIVE_EXTRACT_PERM) { + a->todo |= TODO_MODE_FORCE; /* Be pushy about permissions. */ + /* + * SGID requires an extra "check" step because we + * cannot easily predict the GID that the system will + * assign. (Different systems assign GIDs to files + * based on a variety of criteria, including process + * credentials and the gid of the enclosing + * directory.) We can only restore the SGID bit if + * the file has the right GID, and we only know the + * GID if we either set it (see set_ownership) or if + * we've actually called stat() on the file after it + * was restored. 
Since there are several places at + * which we might verify the GID, we need a TODO bit + * to keep track. + */ + if (a->mode & S_ISGID) + a->todo |= TODO_SGID | TODO_SGID_CHECK; + /* + * Verifying the SUID is simpler, but can still be + * done in multiple ways, hence the separate "check" bit. + */ + if (a->mode & S_ISUID) + a->todo |= TODO_SUID | TODO_SUID_CHECK; + } else { + /* + * User didn't request full permissions, so don't + * restore SUID, SGID bits and obey umask. + */ + a->mode &= ~S_ISUID; + a->mode &= ~S_ISGID; + a->mode &= ~S_ISVTX; + a->mode &= ~a->user_umask; + } +#if !defined(_WIN32) || defined(__CYGWIN__) + if (a->flags & ARCHIVE_EXTRACT_OWNER) + a->todo |= TODO_OWNER; +#endif + if (a->flags & ARCHIVE_EXTRACT_TIME) + a->todo |= TODO_TIMES; + if (a->flags & ARCHIVE_EXTRACT_ACL) + a->todo |= TODO_ACLS; + if (a->flags & ARCHIVE_EXTRACT_XATTR) + a->todo |= TODO_XATTR; + if (a->flags & ARCHIVE_EXTRACT_FFLAGS) + a->todo |= TODO_FFLAGS; + if (a->flags & ARCHIVE_EXTRACT_SECURE_SYMLINKS) { + ret = check_symlinks(a); + if (ret != ARCHIVE_OK) + goto done; + } +#ifdef HAVE_FCHDIR + /* If path exceeds PATH_MAX, shorten the path. */ + edit_deep_directories(a); +#endif + + ret = restore_entry(a); + + /* + * TODO: There are rumours that some extended attributes must + * be restored before file data is written. If this is true, + * then we either need to write all extended attributes both + * before and after restoring the data, or find some rule for + * determining which must go first and which last. Due to the + * many ways people are using xattrs, this may prove to be an + * intractable problem. + */ + +#ifdef HAVE_FCHDIR + /* If we changed directory above, restore it here. 
*/ + if (a->restore_pwd >= 0) { + r = fchdir(a->restore_pwd); + if (r != 0) { + archive_set_error(&a->archive, errno, "chdir() failure"); + ret = ARCHIVE_FATAL; + } + close(a->restore_pwd); + a->restore_pwd = -1; + } +#endif + + /* + * Fixup uses the unedited pathname from archive_entry_pathname(), + * because it is relative to the base dir and the edited path + * might be relative to some intermediate dir as a result of the + * deep restore logic. + */ + if (a->deferred & TODO_MODE) { + fe = current_fixup(a, archive_entry_pathname(entry)); + fe->fixup |= TODO_MODE_BASE; + fe->mode = a->mode; + } + + if ((a->deferred & TODO_TIMES) + && (archive_entry_mtime_is_set(entry) + || archive_entry_atime_is_set(entry))) { + fe = current_fixup(a, archive_entry_pathname(entry)); + fe->fixup |= TODO_TIMES; + if (archive_entry_atime_is_set(entry)) { + fe->atime = archive_entry_atime(entry); + fe->atime_nanos = archive_entry_atime_nsec(entry); + } else { + /* If atime is unset, use start time. */ + fe->atime = a->start_time; + fe->atime_nanos = 0; + } + if (archive_entry_mtime_is_set(entry)) { + fe->mtime = archive_entry_mtime(entry); + fe->mtime_nanos = archive_entry_mtime_nsec(entry); + } else { + /* If mtime is unset, use start time. */ + fe->mtime = a->start_time; + fe->mtime_nanos = 0; + } + if (archive_entry_birthtime_is_set(entry)) { + fe->birthtime = archive_entry_birthtime(entry); + fe->birthtime_nanos = archive_entry_birthtime_nsec(entry); + } else { + /* If birthtime is unset, use mtime. */ + fe->birthtime = fe->mtime; + fe->birthtime_nanos = fe->mtime_nanos; + } + } + + if (a->deferred & TODO_FFLAGS) { + fe = current_fixup(a, archive_entry_pathname(entry)); + fe->fixup |= TODO_FFLAGS; + /* TODO: Complete this.. defer fflags from below. */ + } + + /* We've created the object and are ready to pour data into it. */ + if (ret >= ARCHIVE_WARN) + a->archive.state = ARCHIVE_STATE_DATA; + /* + * If it's not open, tell our client not to try writing. 
+ * In particular, dirs, links, etc, don't get written to. + */ + if (a->fd < 0) { + archive_entry_set_size(entry, 0); + a->filesize = 0; + } +done: + /* Restore the user's umask before returning. */ + umask(a->user_umask); + + return (ret); +} + +int +archive_write_disk_set_skip_file(struct archive *_a, dev_t d, ino_t i) +{ + struct archive_write_disk *a = (struct archive_write_disk *)_a; + __archive_check_magic(&a->archive, ARCHIVE_WRITE_DISK_MAGIC, + ARCHIVE_STATE_ANY, "archive_write_disk_set_skip_file"); + a->skip_file_dev = d; + a->skip_file_ino = i; + return (ARCHIVE_OK); +} + +static ssize_t +write_data_block(struct archive_write_disk *a, const char *buff, size_t size) +{ +#ifndef __minix + uint64_t start_size = size; +#else + size_t start_size = size; +#endif + ssize_t bytes_written = 0; + ssize_t block_size = 0, bytes_to_write; + + if (size == 0) + return (ARCHIVE_OK); + + if (a->filesize == 0 || a->fd < 0) { + archive_set_error(&a->archive, 0, + "Attempt to write to an empty file"); + return (ARCHIVE_WARN); + } + + if (a->flags & ARCHIVE_EXTRACT_SPARSE) { +#if HAVE_STRUCT_STAT_ST_BLKSIZE + int r; + if ((r = _archive_write_disk_lazy_stat(a)) != ARCHIVE_OK) + return (r); + block_size = a->pst->st_blksize; +#else + /* XXX TODO XXX Is there a more appropriate choice here ? */ + /* This needn't match the filesystem allocation size. */ + block_size = 16*1024; +#endif + } + + /* If this write would run beyond the file size, truncate it. */ + if (a->filesize >= 0 && (off_t)(a->offset + size) > a->filesize) + start_size = size = (size_t)(a->filesize - a->offset); + + /* Write the data. */ + while (size > 0) { + if (block_size == 0) { + bytes_to_write = size; + } else { + /* We're sparsifying the file. */ + const char *p, *end; + off_t block_end; + + /* Skip leading zero bytes. 
*/ + for (p = buff, end = buff + size; p < end; ++p) { + if (*p != '\0') + break; + } + a->offset += p - buff; + size -= p - buff; + buff = p; + if (size == 0) + break; + + /* Calculate next block boundary after offset. */ + block_end + = (a->offset / block_size + 1) * block_size; + + /* If the adjusted write would cross block boundary, + * truncate it to the block boundary. */ + bytes_to_write = size; + if (a->offset + bytes_to_write > block_end) + bytes_to_write = block_end - a->offset; + } + /* Seek if necessary to the specified offset. */ + if (a->offset != a->fd_offset) { + if (lseek(a->fd, a->offset, SEEK_SET) < 0) { + archive_set_error(&a->archive, errno, + "Seek failed"); + return (ARCHIVE_FATAL); + } + a->fd_offset = a->offset; + a->archive.file_position = a->offset; + a->archive.raw_position = a->offset; + } + bytes_written = write(a->fd, buff, bytes_to_write); + if (bytes_written < 0) { + archive_set_error(&a->archive, errno, "Write failed"); + return (ARCHIVE_WARN); + } + buff += bytes_written; + size -= bytes_written; + a->offset += bytes_written; + a->archive.file_position += bytes_written; + a->archive.raw_position += bytes_written; + a->fd_offset = a->offset; + } + return (start_size - size); +} + +static ssize_t +_archive_write_data_block(struct archive *_a, + const void *buff, size_t size, off_t offset) +{ + struct archive_write_disk *a = (struct archive_write_disk *)_a; + ssize_t r; + + __archive_check_magic(&a->archive, ARCHIVE_WRITE_DISK_MAGIC, + ARCHIVE_STATE_DATA, "archive_write_disk_block"); + + a->offset = offset; + r = write_data_block(a, buff, size); + if (r < ARCHIVE_OK) + return (r); + if ((size_t)r < size) { + archive_set_error(&a->archive, 0, + "Write request too large"); + return (ARCHIVE_WARN); + } + return (ARCHIVE_OK); +} + +static ssize_t +_archive_write_data(struct archive *_a, const void *buff, size_t size) +{ + struct archive_write_disk *a = (struct archive_write_disk *)_a; + + __archive_check_magic(&a->archive, 
ARCHIVE_WRITE_DISK_MAGIC, + ARCHIVE_STATE_DATA, "archive_write_data"); + + return (write_data_block(a, buff, size)); +} + +static int +_archive_write_finish_entry(struct archive *_a) +{ + struct archive_write_disk *a = (struct archive_write_disk *)_a; + int ret = ARCHIVE_OK; + + __archive_check_magic(&a->archive, ARCHIVE_WRITE_DISK_MAGIC, + ARCHIVE_STATE_HEADER | ARCHIVE_STATE_DATA, + "archive_write_finish_entry"); + if (a->archive.state & ARCHIVE_STATE_HEADER) + return (ARCHIVE_OK); + archive_clear_error(&a->archive); + + /* Pad or truncate file to the right size. */ + if (a->fd < 0) { + /* There's no file. */ + } else if (a->filesize < 0) { + /* File size is unknown, so we can't set the size. */ + } else if (a->fd_offset == a->filesize) { + /* Last write ended at exactly the filesize; we're done. */ + /* Hopefully, this is the common case. */ + } else { +#if HAVE_FTRUNCATE + if (ftruncate(a->fd, a->filesize) == -1 && + a->filesize == 0) { + archive_set_error(&a->archive, errno, + "File size could not be restored"); + return (ARCHIVE_FAILED); + } +#endif + /* + * Not all platforms implement the XSI option to + * extend files via ftruncate. Stat() the file again + * to see what happened. + */ + a->pst = NULL; + if ((ret = _archive_write_disk_lazy_stat(a)) != ARCHIVE_OK) + return (ret); + /* We can use lseek()/write() to extend the file if + * ftruncate didn't work or isn't available. */ + if (a->st.st_size < a->filesize) { + const char nul = '\0'; + if (lseek(a->fd, a->filesize - 1, SEEK_SET) < 0) { + archive_set_error(&a->archive, errno, + "Seek failed"); + return (ARCHIVE_FATAL); + } + if (write(a->fd, &nul, 1) < 0) { + archive_set_error(&a->archive, errno, + "Write to restore size failed"); + return (ARCHIVE_FATAL); + } + a->pst = NULL; + } + } + + /* Restore metadata. */ + + /* + * Look up the "real" UID only if we're going to need it. + * TODO: the TODO_SGID condition can be dropped here, can't it? 
+ */ + if (a->todo & (TODO_OWNER | TODO_SUID | TODO_SGID)) { + a->uid = a->lookup_uid(a->lookup_uid_data, + archive_entry_uname(a->entry), + archive_entry_uid(a->entry)); + } + /* Look up the "real" GID only if we're going to need it. */ + /* TODO: the TODO_SUID condition can be dropped here, can't it? */ + if (a->todo & (TODO_OWNER | TODO_SGID | TODO_SUID)) { + a->gid = a->lookup_gid(a->lookup_gid_data, + archive_entry_gname(a->entry), + archive_entry_gid(a->entry)); + } + /* + * If restoring ownership, do it before trying to restore suid/sgid + * bits. If we set the owner, we know what it is and can skip + * a stat() call to examine the ownership of the file on disk. + */ + if (a->todo & TODO_OWNER) + ret = set_ownership(a); + if (a->todo & TODO_MODE) { + int r2 = set_mode(a, a->mode); + if (r2 < ret) ret = r2; + } + if (a->todo & TODO_ACLS) { + int r2 = set_acls(a); + if (r2 < ret) ret = r2; + } + + /* + * Security-related extended attributes (such as + * security.capability on Linux) have to be restored last, + * since they're implicitly removed by other file changes. + */ + if (a->todo & TODO_XATTR) { + int r2 = set_xattrs(a); + if (r2 < ret) ret = r2; + } + + /* + * Some flags prevent file modification; they must be restored after + * file contents are written. + */ + if (a->todo & TODO_FFLAGS) { + int r2 = set_fflags(a); + if (r2 < ret) ret = r2; + } + /* + * Time has to be restored after all other metadata; + * otherwise atime will get changed. + */ + if (a->todo & TODO_TIMES) { + int r2 = set_times(a); + if (r2 < ret) ret = r2; + } + + /* If there's an fd, we can close it now. */ + if (a->fd >= 0) { + close(a->fd); + a->fd = -1; + } + /* If there's an entry, we can release it now. 
*/ + if (a->entry) { + archive_entry_free(a->entry); + a->entry = NULL; + } + a->archive.state = ARCHIVE_STATE_HEADER; + return (ret); +} + +int +archive_write_disk_set_group_lookup(struct archive *_a, + void *private_data, + gid_t (*lookup_gid)(void *private, const char *gname, gid_t gid), + void (*cleanup_gid)(void *private)) +{ + struct archive_write_disk *a = (struct archive_write_disk *)_a; + __archive_check_magic(&a->archive, ARCHIVE_WRITE_DISK_MAGIC, + ARCHIVE_STATE_ANY, "archive_write_disk_set_group_lookup"); + + a->lookup_gid = lookup_gid; + a->cleanup_gid = cleanup_gid; + a->lookup_gid_data = private_data; + return (ARCHIVE_OK); +} + +int +archive_write_disk_set_user_lookup(struct archive *_a, + void *private_data, + uid_t (*lookup_uid)(void *private, const char *uname, uid_t uid), + void (*cleanup_uid)(void *private)) +{ + struct archive_write_disk *a = (struct archive_write_disk *)_a; + __archive_check_magic(&a->archive, ARCHIVE_WRITE_DISK_MAGIC, + ARCHIVE_STATE_ANY, "archive_write_disk_set_user_lookup"); + + a->lookup_uid = lookup_uid; + a->cleanup_uid = cleanup_uid; + a->lookup_uid_data = private_data; + return (ARCHIVE_OK); +} + + +/* + * Create a new archive_write_disk object and initialize it with global state. + */ +struct archive * +archive_write_disk_new(void) +{ + struct archive_write_disk *a; + + a = (struct archive_write_disk *)malloc(sizeof(*a)); + if (a == NULL) + return (NULL); + memset(a, 0, sizeof(*a)); + a->archive.magic = ARCHIVE_WRITE_DISK_MAGIC; + /* We're ready to write a header immediately. 
*/ + a->archive.state = ARCHIVE_STATE_HEADER; + a->archive.vtable = archive_write_disk_vtable(); + a->lookup_uid = trivial_lookup_uid; + a->lookup_gid = trivial_lookup_gid; + a->start_time = time(NULL); +#ifdef HAVE_GETEUID + a->user_uid = geteuid(); +#endif /* HAVE_GETEUID */ + if (archive_string_ensure(&a->path_safe, 512) == NULL) { + free(a); + return (NULL); + } + return (&a->archive); +} + + +/* + * If pathname is longer than PATH_MAX, chdir to a suitable + * intermediate dir and edit the path down to a shorter suffix. Note + * that this routine never returns an error; if the chdir() attempt + * fails for any reason, we just go ahead with the long pathname. The + * object creation is likely to fail, but any error will get handled + * at that time. + */ +#ifdef HAVE_FCHDIR +static void +edit_deep_directories(struct archive_write_disk *a) +{ + int ret; + char *tail = a->name; + + a->restore_pwd = -1; + + /* If path is short, avoid the open() below. */ + if (strlen(tail) <= PATH_MAX) + return; + + /* Try to record our starting dir. */ + a->restore_pwd = open(".", O_RDONLY | O_BINARY); + if (a->restore_pwd < 0) + return; + + /* As long as the path is too long... */ + while (strlen(tail) > PATH_MAX) { + /* Locate a dir prefix shorter than PATH_MAX. */ + tail += PATH_MAX - 8; + while (tail > a->name && *tail != '/') + tail--; + /* Exit if we find a too-long path component. */ + if (tail <= a->name) + return; + /* Create the intermediate dir and chdir to it. */ + *tail = '\0'; /* Terminate dir portion */ + ret = create_dir(a, a->name); + if (ret == ARCHIVE_OK && chdir(a->name) != 0) + ret = ARCHIVE_FAILED; + *tail = '/'; /* Restore the / we removed. */ + if (ret != ARCHIVE_OK) + return; + tail++; + /* The chdir() succeeded; we've now shortened the path. */ + a->name = tail; + } + return; +} +#endif + +/* + * The main restore function. 
+ */ +static int +restore_entry(struct archive_write_disk *a) +{ + int ret = ARCHIVE_OK, en; + + if (a->flags & ARCHIVE_EXTRACT_UNLINK && !S_ISDIR(a->mode)) { + /* + * TODO: Fix this. Apparently, there are platforms + * that still allow root to hose the entire filesystem + * by unlinking a dir. The S_ISDIR() test above + * prevents us from using unlink() here if the new + * object is a dir, but that doesn't mean the old + * object isn't a dir. + */ + if (unlink(a->name) == 0) { + /* We removed it, reset cached stat. */ + a->pst = NULL; + } else if (errno == ENOENT) { + /* File didn't exist, that's just as good. */ + } else if (rmdir(a->name) == 0) { + /* It was a dir, but now it's gone. */ + a->pst = NULL; + } else { + /* We tried, but couldn't get rid of it. */ + archive_set_error(&a->archive, errno, + "Could not unlink"); + return(ARCHIVE_FAILED); + } + } + + /* Try creating it first; if this fails, we'll try to recover. */ + en = create_filesystem_object(a); + + if ((en == ENOTDIR || en == ENOENT) + && !(a->flags & ARCHIVE_EXTRACT_NO_AUTODIR)) { + /* If the parent dir doesn't exist, try creating it. */ + create_parent_dir(a, a->name); + /* Now try to create the object again. */ + en = create_filesystem_object(a); + } + + if ((en == EISDIR || en == EEXIST) + && (a->flags & ARCHIVE_EXTRACT_NO_OVERWRITE)) { + /* If we're not overwriting, we're done. */ + archive_set_error(&a->archive, en, "Already exists"); + return (ARCHIVE_FAILED); + } + + /* + * Some platforms return EISDIR if you call + * open(O_WRONLY | O_EXCL | O_CREAT) on a directory, some + * return EEXIST. POSIX is ambiguous, requiring EISDIR + * for open(O_WRONLY) on a dir and EEXIST for open(O_EXCL | O_CREAT) + * on an existing item. + */ + if (en == EISDIR) { + /* A dir is in the way of a non-dir, rmdir it. */ + if (rmdir(a->name) != 0) { + archive_set_error(&a->archive, errno, + "Can't remove already-existing dir"); + return (ARCHIVE_FAILED); + } + a->pst = NULL; + /* Try again. 
*/ + en = create_filesystem_object(a); + } else if (en == EEXIST) { + /* + * We know something is in the way, but we don't know what; + * we need to find out before we go any further. + */ + int r = 0; + /* + * The SECURE_SYMLINK logic has already removed a + * symlink to a dir if the client wants that. So + * follow the symlink if we're creating a dir. + */ + if (S_ISDIR(a->mode)) + r = stat(a->name, &a->st); + /* + * If it's not a dir (or it's a broken symlink), + * then don't follow it. + */ + if (r != 0 || !S_ISDIR(a->mode)) + r = lstat(a->name, &a->st); + if (r != 0) { + archive_set_error(&a->archive, errno, + "Can't stat existing object"); + return (ARCHIVE_FAILED); + } + + /* + * NO_OVERWRITE_NEWER doesn't apply to directories. + */ + if ((a->flags & ARCHIVE_EXTRACT_NO_OVERWRITE_NEWER) + && !S_ISDIR(a->st.st_mode)) { + if (!older(&(a->st), a->entry)) { + archive_set_error(&a->archive, 0, + "File on disk is not older; skipping."); + return (ARCHIVE_FAILED); + } + } + + /* If it's our archive, we're done. */ + if (a->skip_file_dev > 0 && + a->skip_file_ino > 0 && + a->st.st_dev == a->skip_file_dev && + a->st.st_ino == a->skip_file_ino) { + archive_set_error(&a->archive, 0, "Refusing to overwrite archive"); + return (ARCHIVE_FAILED); + } + + if (!S_ISDIR(a->st.st_mode)) { + /* A non-dir is in the way, unlink it. */ + if (unlink(a->name) != 0) { + archive_set_error(&a->archive, errno, + "Can't unlink already-existing object"); + return (ARCHIVE_FAILED); + } + a->pst = NULL; + /* Try again. */ + en = create_filesystem_object(a); + } else if (!S_ISDIR(a->mode)) { + /* A dir is in the way of a non-dir, rmdir it. */ + if (rmdir(a->name) != 0) { + archive_set_error(&a->archive, errno, + "Can't remove already-existing dir"); + return (ARCHIVE_FAILED); + } + /* Try again. */ + en = create_filesystem_object(a); + } else { + /* + * There's a dir in the way of a dir. Don't + * waste time with rmdir()/mkdir(), just fix + * up the permissions on the existing dir. 
+ * Note that we don't change perms on existing + * dirs unless _EXTRACT_PERM is specified. + */ + if ((a->mode != a->st.st_mode) + && (a->todo & TODO_MODE_FORCE)) + a->deferred |= (a->todo & TODO_MODE); + /* Ownership doesn't need deferred fixup. */ + en = 0; /* Forget the EEXIST. */ + } + } + + if (en) { + /* Everything failed; give up here. */ + archive_set_error(&a->archive, en, "Can't create '%s'", + a->name); + return (ARCHIVE_FAILED); + } + + a->pst = NULL; /* Cached stat data no longer valid. */ + return (ret); +} + +/* + * Returns 0 if creation succeeds, or else returns errno value from + * the failed system call. Note: This function should only ever perform + * a single system call. + */ +static int +create_filesystem_object(struct archive_write_disk *a) +{ + /* Create the entry. */ + const char *linkname; + mode_t final_mode, mode; + int r; + + /* We identify hard/symlinks according to the link names. */ + /* Since link(2) and symlink(2) don't handle modes, we're done here. */ + linkname = archive_entry_hardlink(a->entry); + if (linkname != NULL) { +#if !HAVE_LINK + return (EPERM); +#else + r = link(linkname, a->name) ? errno : 0; + /* + * New cpio and pax formats allow hardlink entries + * to carry data, so we may have to open the file + * for hardlink entries. + * + * If the hardlink was successfully created and + * the archive doesn't carry data for it, + * consider it to be non-authoritative for meta data. + * This is consistent with GNU tar and BSD pax. + * If the hardlink does carry data, let the last + * archive entry decide ownership. + */ + if (r == 0 && a->filesize <= 0) { + a->todo = 0; + a->deferred = 0; + } if (r == 0 && a->filesize > 0) { + a->fd = open(a->name, O_WRONLY | O_TRUNC | O_BINARY); + if (a->fd < 0) + r = errno; + } + return (r); +#endif + } + linkname = archive_entry_symlink(a->entry); + if (linkname != NULL) { +#if HAVE_SYMLINK + return symlink(linkname, a->name) ?
errno : 0; +#else + return (EPERM); +#endif + } + + /* + * The remaining system calls all set permissions, so let's + * try to take advantage of that to avoid an extra chmod() + * call. (Recall that umask is set to zero right now!) + */ + + /* Mode we want for the final restored object (w/o file type bits). */ + final_mode = a->mode & 07777; + /* + * The mode that will actually be restored in this step. Note + * that SUID, SGID, etc, require additional work to ensure + * security, so we never restore them at this point. + */ + mode = final_mode & 0777; + + switch (a->mode & AE_IFMT) { + default: + /* POSIX requires that we fall through here. */ + /* FALLTHROUGH */ + case AE_IFREG: + a->fd = open(a->name, + O_WRONLY | O_CREAT | O_EXCL | O_BINARY, mode); + r = (a->fd < 0); + break; + case AE_IFCHR: +#ifdef HAVE_MKNOD + /* Note: we use AE_IFCHR for the case label, and + * S_IFCHR for the mknod() call. This is correct. */ + r = mknod(a->name, mode | S_IFCHR, + archive_entry_rdev(a->entry)); + break; +#else + /* TODO: Find a better way to warn about our inability + * to restore a char device node. */ + return (EINVAL); +#endif /* HAVE_MKNOD */ + case AE_IFBLK: +#ifdef HAVE_MKNOD + r = mknod(a->name, mode | S_IFBLK, + archive_entry_rdev(a->entry)); + break; +#else + /* TODO: Find a better way to warn about our inability + * to restore a block device node. */ + return (EINVAL); +#endif /* HAVE_MKNOD */ + case AE_IFDIR: + mode = (mode | MINIMUM_DIR_MODE) & MAXIMUM_DIR_MODE; + r = mkdir(a->name, mode); + if (r == 0) { + /* Defer setting dir times. */ + a->deferred |= (a->todo & TODO_TIMES); + a->todo &= ~TODO_TIMES; + /* Never use an immediate chmod(). */ + /* We can't avoid the chmod() entirely if EXTRACT_PERM + * because of SysV SGID inheritance. 
*/ + if ((mode != final_mode) + || (a->flags & ARCHIVE_EXTRACT_PERM)) + a->deferred |= (a->todo & TODO_MODE); + a->todo &= ~TODO_MODE; + } + break; + case AE_IFIFO: +#ifdef HAVE_MKFIFO + r = mkfifo(a->name, mode); + break; +#else + /* TODO: Find a better way to warn about our inability + * to restore a fifo. */ + return (EINVAL); +#endif /* HAVE_MKFIFO */ + } + + /* All the system calls above set errno on failure. */ + if (r) + return (errno); + + /* If we managed to set the final mode, we've avoided a chmod(). */ + if (mode == final_mode) + a->todo &= ~TODO_MODE; + return (0); +} + +/* + * Cleanup function for archive_extract. Mostly, this involves processing + * the fixup list, which is used to address a number of problems: + * * Dir permissions might prevent us from restoring a file in that + * dir, so we restore the dir with minimum 0700 permissions first, + * then correct the mode at the end. + * * Similarly, the act of restoring a file touches the directory + * and changes the timestamp on the dir, so we have to touch-up dir + * timestamps at the end as well. + * * Some file flags can interfere with the restore by, for example, + * preventing the creation of hardlinks to those files. + * + * Note that tar/cpio do not require that archives be in a particular + * order; there is no way to know when the last file has been restored + * within a directory, so there's no way to optimize the memory usage + * here by fixing up the directory any earlier than the + * end-of-archive. + * + * XXX TODO: Directory ACLs should be restored here, for the same + * reason we set directory perms here. 
XXX + */ +static int +_archive_write_close(struct archive *_a) +{ + struct archive_write_disk *a = (struct archive_write_disk *)_a; + struct fixup_entry *next, *p; + int ret; + + __archive_check_magic(&a->archive, ARCHIVE_WRITE_DISK_MAGIC, + ARCHIVE_STATE_HEADER | ARCHIVE_STATE_DATA, + "archive_write_disk_close"); + ret = _archive_write_finish_entry(&a->archive); + + /* Sort dir list so directories are fixed up in depth-first order. */ + p = sort_dir_list(a->fixup_list); + + while (p != NULL) { + a->pst = NULL; /* Mark stat cache as out-of-date. */ + if (p->fixup & TODO_TIMES) { +#ifdef HAVE_UTIMES + /* {f,l,}utimes() are preferred, when available. */ +#if defined(_WIN32) && !defined(__CYGWIN__) + struct __timeval times[2]; +#else + struct timeval times[2]; +#endif + times[0].tv_sec = p->atime; + times[0].tv_usec = p->atime_nanos / 1000; +#ifdef HAVE_STRUCT_STAT_ST_BIRTHTIME + /* if it's valid and not mtime, push the birthtime first */ + if (((times[1].tv_sec = p->birthtime) < p->mtime) && + (p->birthtime > 0)) + { + times[1].tv_usec = p->birthtime_nanos / 1000; + utimes(p->name, times); + } +#endif + times[1].tv_sec = p->mtime; + times[1].tv_usec = p->mtime_nanos / 1000; +#ifdef HAVE_LUTIMES + lutimes(p->name, times); +#else + utimes(p->name, times); +#endif +#else + /* utime() is more portable, but less precise. 
*/ + struct utimbuf times; + times.modtime = p->mtime; + times.actime = p->atime; + + utime(p->name, &times); +#endif + } + if (p->fixup & TODO_MODE_BASE) + chmod(p->name, p->mode); + + if (p->fixup & TODO_FFLAGS) + set_fflags_platform(a, -1, p->name, + p->mode, p->fflags_set, 0); + + next = p->next; + free(p->name); + free(p); + p = next; + } + a->fixup_list = NULL; + return (ret); +} + +static int +_archive_write_finish(struct archive *_a) +{ + struct archive_write_disk *a = (struct archive_write_disk *)_a; + int ret; + ret = _archive_write_close(&a->archive); + if (a->cleanup_gid != NULL && a->lookup_gid_data != NULL) + (a->cleanup_gid)(a->lookup_gid_data); + if (a->cleanup_uid != NULL && a->lookup_uid_data != NULL) + (a->cleanup_uid)(a->lookup_uid_data); + if (a->entry) + archive_entry_free(a->entry); + archive_string_free(&a->_name_data); + archive_string_free(&a->archive.error_string); + archive_string_free(&a->path_safe); + free(a); + return (ret); +} + +/* + * Simple O(n log n) merge sort to order the fixup list. In + * particular, we want to restore dir timestamps depth-first. + */ +static struct fixup_entry * +sort_dir_list(struct fixup_entry *p) +{ + struct fixup_entry *a, *b, *t; + + if (p == NULL) + return (NULL); + /* A one-item list is already sorted. */ + if (p->next == NULL) + return (p); + + /* Step 1: split the list. */ + t = p; + a = p->next->next; + while (a != NULL) { + /* Step a twice, t once. */ + a = a->next; + if (a != NULL) + a = a->next; + t = t->next; + } + /* Now, t is at the mid-point, so break the list here. */ + b = t->next; + t->next = NULL; + a = p; + + /* Step 2: Recursively sort the two sub-lists. */ + a = sort_dir_list(a); + b = sort_dir_list(b); + + /* Step 3: Merge the returned lists. */ + /* Pick the first element for the merged list. */ + if (strcmp(a->name, b->name) > 0) { + t = p = a; + a = a->next; + } else { + t = p = b; + b = b->next; + } + + /* Always put the later element on the list first. 
*/ + while (a != NULL && b != NULL) { + if (strcmp(a->name, b->name) > 0) { + t->next = a; + a = a->next; + } else { + t->next = b; + b = b->next; + } + t = t->next; + } + + /* Only one list is non-empty, so just splice it on. */ + if (a != NULL) + t->next = a; + if (b != NULL) + t->next = b; + + return (p); +} + +/* + * Returns a new, initialized fixup entry. + * + * TODO: Reduce the memory requirements for this list by using a tree + * structure rather than a simple list of names. + */ +static struct fixup_entry * +new_fixup(struct archive_write_disk *a, const char *pathname) +{ + struct fixup_entry *fe; + + fe = (struct fixup_entry *)malloc(sizeof(struct fixup_entry)); + if (fe == NULL) + return (NULL); + fe->next = a->fixup_list; + a->fixup_list = fe; + fe->fixup = 0; + fe->name = strdup(pathname); + return (fe); +} + +/* + * Returns a fixup structure for the current entry. + */ +static struct fixup_entry * +current_fixup(struct archive_write_disk *a, const char *pathname) +{ + if (a->current_fixup == NULL) + a->current_fixup = new_fixup(a, pathname); + return (a->current_fixup); +} + +/* TODO: Make this work. */ +/* + * TODO: The deep-directory support bypasses this; disable deep directory + * support if we're doing symlink checks. + */ +/* + * TODO: Someday, integrate this with the deep dir support; they both + * scan the path and both can be optimized by comparing against other + * recent paths. + */ +/* TODO: Extend this to support symlinks on Windows Vista and later. */ +static int +check_symlinks(struct archive_write_disk *a) +{ +#if !defined(HAVE_LSTAT) + /* Platform doesn't have lstat, so we can't look for symlinks. */ + (void)a; /* UNUSED */ + return (ARCHIVE_OK); +#else + char *pn, *p; + char c; + int r; + struct stat st; + + /* + * Guard against symlink tricks. Reject any archive entry whose + * destination would be altered by a symlink. + */ + /* Whatever we checked last time doesn't need to be re-checked. 
*/ + pn = a->name; + p = a->path_safe.s; + while ((*pn != '\0') && (*p == *pn)) + ++p, ++pn; + c = pn[0]; + /* Keep going until we've checked the entire name. */ + while (pn[0] != '\0' && (pn[0] != '/' || pn[1] != '\0')) { + /* Skip the next path element. */ + while (*pn != '\0' && *pn != '/') + ++pn; + c = pn[0]; + pn[0] = '\0'; + /* Check that we haven't hit a symlink. */ + r = lstat(a->name, &st); + if (r != 0) { + /* We've hit a dir that doesn't exist; stop now. */ + if (errno == ENOENT) + break; + } else if (S_ISLNK(st.st_mode)) { + if (c == '\0') { + /* + * Last element is symlink; remove it + * so we can overwrite it with the + * item being extracted. + */ + if (unlink(a->name)) { + archive_set_error(&a->archive, errno, + "Could not remove symlink %s", + a->name); + pn[0] = c; + return (ARCHIVE_FAILED); + } + a->pst = NULL; + /* + * Even if we did remove it, a warning + * is in order. The warning is silly, + * though, if we're just replacing one + * symlink with another symlink. + */ + if (!S_ISLNK(a->mode)) { + archive_set_error(&a->archive, 0, + "Removing symlink %s", + a->name); + } + /* Symlink gone. No more problem! */ + pn[0] = c; + return (0); + } else if (a->flags & ARCHIVE_EXTRACT_UNLINK) { + /* User asked us to remove problems. */ + if (unlink(a->name) != 0) { + archive_set_error(&a->archive, 0, + "Cannot remove intervening symlink %s", + a->name); + pn[0] = c; + return (ARCHIVE_FAILED); + } + a->pst = NULL; + } else { + archive_set_error(&a->archive, 0, + "Cannot extract through symlink %s", + a->name); + pn[0] = c; + return (ARCHIVE_FAILED); + } + } + } + pn[0] = c; + /* We've checked and/or cleaned the whole path, so remember it. */ + archive_strcpy(&a->path_safe, a->name); + return (ARCHIVE_OK); +#endif +} + +#if defined(_WIN32) || defined(__CYGWIN__) +/* + * 1. Convert a path separator from '\' to '/' . 
+ * We shouldn't check multi-byte character directly because some + * character-set have been using the '\' character for a part of + * its multibyte character code. + * 2. Replace unusable characters in Windows with underscore('_'). + * See also : http://msdn.microsoft.com/en-us/library/aa365247.aspx + */ +static void +cleanup_pathname_win(struct archive_write_disk *a) +{ + wchar_t wc; + char *p; + size_t alen, l; + + alen = 0; + l = 0; + for (p = a->name; *p != '\0'; p++) { + ++alen; + if (*p == '\\') + l = 1; + /* Rewrite the path name if its character is a unusable. */ + if (*p == ':' || *p == '*' || *p == '?' || *p == '"' || + *p == '<' || *p == '>' || *p == '|') + *p = '_'; + } + if (alen == 0 || l == 0) + return; + /* + * Convert path separator. + */ + p = a->name; + while (*p != '\0' && alen) { + l = mbtowc(&wc, p, alen); + if (l == -1) { + while (*p != '\0') { + if (*p == '\\') + *p = '/'; + ++p; + } + break; + } + if (l == 1 && wc == L'\\') + *p = '/'; + p += l; + alen -= l; + } +} +#endif + +/* + * Canonicalize the pathname. In particular, this strips duplicate + * '/' characters, '.' elements, and trailing '/'. It also raises an + * error for an empty path, a trailing '..' or (if _SECURE_NODOTDOT is + * set) any '..' in the path. + */ +static int +cleanup_pathname(struct archive_write_disk *a) +{ + char *dest, *src; + char separator = '\0'; + + dest = src = a->name; + if (*src == '\0') { + archive_set_error(&a->archive, ARCHIVE_ERRNO_MISC, + "Invalid empty pathname"); + return (ARCHIVE_FAILED); + } + +#if defined(_WIN32) || defined(__CYGWIN__) + cleanup_pathname_win(a); +#endif + /* Skip leading '/'. */ + if (*src == '/') + separator = *src++; + + /* Scan the pathname one element at a time. */ + for (;;) { + /* src points to first char after '/' */ + if (src[0] == '\0') { + break; + } else if (src[0] == '/') { + /* Found '//', ignore second one. */ + src++; + continue; + } else if (src[0] == '.') { + if (src[1] == '\0') { + /* Ignore trailing '.' 
*/ + break; + } else if (src[1] == '/') { + /* Skip './'. */ + src += 2; + continue; + } else if (src[1] == '.') { + if (src[2] == '/' || src[2] == '\0') { + /* Conditionally warn about '..' */ + if (a->flags & ARCHIVE_EXTRACT_SECURE_NODOTDOT) { + archive_set_error(&a->archive, + ARCHIVE_ERRNO_MISC, + "Path contains '..'"); + return (ARCHIVE_FAILED); + } + } + /* + * Note: Under no circumstances do we + * remove '..' elements. In + * particular, restoring + * '/foo/../bar/' should create the + * 'foo' dir as a side-effect. + */ + } + } + + /* Copy current element, including leading '/'. */ + if (separator) + *dest++ = '/'; + while (*src != '\0' && *src != '/') { + *dest++ = *src++; + } + + if (*src == '\0') + break; + + /* Skip '/' separator. */ + separator = *src++; + } + /* + * We've just copied zero or more path elements, not including the + * final '/'. + */ + if (dest == a->name) { + /* + * Nothing got copied. The path must have been something + * like '.' or '/' or './' or '/././././/./'. + */ + if (separator) + *dest++ = '/'; + else + *dest++ = '.'; + } + /* Terminate the result. */ + *dest = '\0'; + return (ARCHIVE_OK); +} + +/* + * Create the parent directory of the specified path, assuming path + * is already in mutable storage. + */ +static int +create_parent_dir(struct archive_write_disk *a, char *path) +{ + char *slash; + int r; + + /* Remove tail element to obtain parent name. */ + slash = strrchr(path, '/'); + if (slash == NULL) + return (ARCHIVE_OK); + *slash = '\0'; + r = create_dir(a, path); + *slash = '/'; + return (r); +} + +/* + * Create the specified dir, recursing to create parents as necessary. + * + * Returns ARCHIVE_OK if the path exists when we're done here. + * Otherwise, returns ARCHIVE_FAILED. + * Assumes path is in mutable storage; path is unchanged on exit. 
+ */ +static int +create_dir(struct archive_write_disk *a, char *path) +{ + struct stat st; + struct fixup_entry *le; + char *slash, *base; + mode_t mode_final, mode; + int r; + + /* Check for special names and just skip them. */ + slash = strrchr(path, '/'); + if (slash == NULL) + base = path; + else + base = slash + 1; + + if (base[0] == '\0' || + (base[0] == '.' && base[1] == '\0') || + (base[0] == '.' && base[1] == '.' && base[2] == '\0')) { + /* Don't bother trying to create null path, '.', or '..'. */ + if (slash != NULL) { + *slash = '\0'; + r = create_dir(a, path); + *slash = '/'; + return (r); + } + return (ARCHIVE_OK); + } + + /* + * Yes, this should be stat() and not lstat(). Using lstat() + * here loses the ability to extract through symlinks. Also note + * that this should not use the a->st cache. + */ + if (stat(path, &st) == 0) { + if (S_ISDIR(st.st_mode)) + return (ARCHIVE_OK); + if ((a->flags & ARCHIVE_EXTRACT_NO_OVERWRITE)) { + archive_set_error(&a->archive, EEXIST, + "Can't create directory '%s'", path); + return (ARCHIVE_FAILED); + } + if (unlink(path) != 0) { + archive_set_error(&a->archive, errno, + "Can't create directory '%s': " + "Conflicting file cannot be removed"); + return (ARCHIVE_FAILED); + } + } else if (errno != ENOENT && errno != ENOTDIR) { + /* Stat failed? */ + archive_set_error(&a->archive, errno, "Can't test directory '%s'", path); + return (ARCHIVE_FAILED); + } else if (slash != NULL) { + *slash = '\0'; + r = create_dir(a, path); + *slash = '/'; + if (r != ARCHIVE_OK) + return (r); + } + + /* + * Mode we want for the final restored directory. Per POSIX, + * implicitly-created dirs must be created obeying the umask. + * There's no mention whether this is different for privileged + * restores (which the rest of this code handles by pretending + * umask=0). I've chosen here to always obey the user's umask for + * implicit dirs, even if _EXTRACT_PERM was specified. 
+ */ + mode_final = DEFAULT_DIR_MODE & ~a->user_umask; + /* Mode we want on disk during the restore process. */ + mode = mode_final; + mode |= MINIMUM_DIR_MODE; + mode &= MAXIMUM_DIR_MODE; + if (mkdir(path, mode) == 0) { + if (mode != mode_final) { + le = new_fixup(a, path); + le->fixup |=TODO_MODE_BASE; + le->mode = mode_final; + } + return (ARCHIVE_OK); + } + + /* + * Without the following check, a/b/../b/c/d fails at the + * second visit to 'b', so 'd' can't be created. Note that we + * don't add it to the fixup list here, as it's already been + * added. + */ + if (stat(path, &st) == 0 && S_ISDIR(st.st_mode)) + return (ARCHIVE_OK); + + archive_set_error(&a->archive, errno, "Failed to create dir '%s'", + path); + return (ARCHIVE_FAILED); +} + +/* + * Note: Although we can skip setting the user id if the desired user + * id matches the current user, we cannot skip setting the group, as + * many systems set the gid based on the containing directory. So + * we have to perform a chown syscall if we want to set the SGID + * bit. (The alternative is to stat() and then possibly chown(); it's + * more efficient to skip the stat() and just always chown().) Note + * that a successful chown() here clears the TODO_SGID_CHECK bit, which + * allows set_mode to skip the stat() check for the GID. + */ +static int +set_ownership(struct archive_write_disk *a) +{ +#ifndef __CYGWIN__ +/* unfortunately, on win32 there is no 'root' user with uid 0, + so we just have to try the chown and see if it works */ + + /* If we know we can't change it, don't bother trying. */ + if (a->user_uid != 0 && a->user_uid != a->uid) { + archive_set_error(&a->archive, errno, + "Can't set UID=%d", a->uid); + return (ARCHIVE_WARN); + } +#endif + +#ifdef HAVE_FCHOWN + /* If we have an fd, we can avoid a race. */ + if (a->fd >= 0 && fchown(a->fd, a->uid, a->gid) == 0) { + /* We've set owner and know uid/gid are correct. 
*/ + a->todo &= ~(TODO_OWNER | TODO_SGID_CHECK | TODO_SUID_CHECK); + return (ARCHIVE_OK); + } +#endif + + /* We prefer lchown() but will use chown() if that's all we have. */ + /* Of course, if we have neither, this will always fail. */ +#ifdef HAVE_LCHOWN + if (lchown(a->name, a->uid, a->gid) == 0) { + /* We've set owner and know uid/gid are correct. */ + a->todo &= ~(TODO_OWNER | TODO_SGID_CHECK | TODO_SUID_CHECK); + return (ARCHIVE_OK); + } +#elif HAVE_CHOWN + if (!S_ISLNK(a->mode) && chown(a->name, a->uid, a->gid) == 0) { + /* We've set owner and know uid/gid are correct. */ + a->todo &= ~(TODO_OWNER | TODO_SGID_CHECK | TODO_SUID_CHECK); + return (ARCHIVE_OK); + } +#endif + + archive_set_error(&a->archive, errno, + "Can't set user=%d/group=%d for %s", a->uid, a->gid, + a->name); + return (ARCHIVE_WARN); +} + + +#if defined(HAVE_UTIMENSAT) && defined(HAVE_FUTIMENS) +/* + * utimensat() and futimens() are defined in POSIX.1-2008. They provide ns + * resolution and setting times on fd and on symlinks, too. + */ +static int +set_time(int fd, int mode, const char *name, + time_t atime, long atime_nsec, + time_t mtime, long mtime_nsec) +{ + struct timespec ts[2]; + ts[0].tv_sec = atime; + ts[0].tv_nsec = atime_nsec; + ts[1].tv_sec = mtime; + ts[1].tv_nsec = mtime_nsec; + if (fd >= 0) + return futimens(fd, ts); + return utimensat(AT_FDCWD, name, ts, AT_SYMLINK_NOFOLLOW); +} +#elif HAVE_UTIMES +/* + * The utimes()-family functions provide µs-resolution and + * a way to set time on an fd or a symlink. We prefer them + * when they're available and utimensat/futimens aren't there. 
*/ +static int +set_time(int fd, int mode, const char *name, + time_t atime, long atime_nsec, + time_t mtime, long mtime_nsec) +{ +#if defined(_WIN32) && !defined(__CYGWIN__) + struct __timeval times[2]; +#else + struct timeval times[2]; +#endif + + times[0].tv_sec = atime; + times[0].tv_usec = atime_nsec / 1000; + times[1].tv_sec = mtime; + times[1].tv_usec = mtime_nsec / 1000; + +#ifdef HAVE_FUTIMES + if (fd >= 0) + return (futimes(fd, times)); +#else + (void)fd; /* UNUSED */ +#endif +#ifdef HAVE_LUTIMES + (void)mode; /* UNUSED */ + return (lutimes(name, times)); +#else + if (S_ISLNK(mode)) + return (0); + return (utimes(name, times)); +#endif +} +#elif defined(HAVE_UTIME) +/* + * utime() is an older, more standard interface that we'll use + * if utimes() isn't available. + */ +static int +set_time(int fd, int mode, const char *name, + time_t atime, long atime_nsec, + time_t mtime, long mtime_nsec) +{ + struct utimbuf times; + (void)fd; /* UNUSED */ + (void)name; /* UNUSED */ + (void)atime_nsec; /* UNUSED */ + (void)mtime_nsec; /* UNUSED */ + times.actime = atime; + times.modtime = mtime; + if (S_ISLNK(mode)) + return (ARCHIVE_OK); + return (utime(name, &times)); +} +#else +static int +set_time(int fd, int mode, const char *name, + time_t atime, long atime_nsec, + time_t mtime, long mtime_nsec) +{ + return (ARCHIVE_WARN); +} +#endif + +static int +set_times(struct archive_write_disk *a) +{ + time_t atime = a->start_time, mtime = a->start_time; + long atime_nsec = 0, mtime_nsec = 0; + + /* If no time was provided, we're done. */ + if (!archive_entry_atime_is_set(a->entry) +#if HAVE_STRUCT_STAT_ST_BIRTHTIME + && !archive_entry_birthtime_is_set(a->entry) +#endif + && !archive_entry_mtime_is_set(a->entry)) + return (ARCHIVE_OK); + + /* If no atime was specified, use start time instead. */ + /* In theory, it would be marginally more correct to use + * time(NULL) here, but that would cost us an extra syscall + * for little gain. 
*/ + if (archive_entry_atime_is_set(a->entry)) { + atime = archive_entry_atime(a->entry); + atime_nsec = archive_entry_atime_nsec(a->entry); + } + + /* + * If you have struct stat.st_birthtime, we assume BSD birthtime + * semantics, in which {f,l,}utimes() updates birthtime to earliest + * mtime. So we set the time twice, first using the birthtime, + * then using the mtime. + */ +#if HAVE_STRUCT_STAT_ST_BIRTHTIME + /* If birthtime is set, flush that through to disk first. */ + if (archive_entry_birthtime_is_set(a->entry)) + if (set_time(a->fd, a->mode, a->name, atime, atime_nsec, + archive_entry_birthtime(a->entry), + archive_entry_birthtime_nsec(a->entry))) { + archive_set_error(&a->archive, errno, + "Can't update time for %s", + a->name); + return (ARCHIVE_WARN); + } +#endif + + if (archive_entry_mtime_is_set(a->entry)) { + mtime = archive_entry_mtime(a->entry); + mtime_nsec = archive_entry_mtime_nsec(a->entry); + } + if (set_time(a->fd, a->mode, a->name, + atime, atime_nsec, mtime, mtime_nsec)) { + archive_set_error(&a->archive, errno, + "Can't update time for %s", + a->name); + return (ARCHIVE_WARN); + } + + /* + * Note: POSIX does not provide a portable way to restore ctime. + * (Apart from resetting the system clock, which is distasteful.) + * So, any restoration of ctime will necessarily be OS-specific. + */ + + return (ARCHIVE_OK); +} + +static int +set_mode(struct archive_write_disk *a, int mode) +{ + int r = ARCHIVE_OK; + mode &= 07777; /* Strip off file type bits. */ + + if (a->todo & TODO_SGID_CHECK) { + /* + * If we don't know the GID is right, we must stat() + * to verify it. We can't just check the GID of this + * process, since systems sometimes set GID from + * the enclosing dir or based on ACLs. 
+ */ + if ((r = _archive_write_disk_lazy_stat(a)) != ARCHIVE_OK) + return (r); + if (a->pst->st_gid != a->gid) { + mode &= ~ S_ISGID; +#if !defined(_WIN32) || defined(__CYGWIN__) + if (a->flags & ARCHIVE_EXTRACT_OWNER) { + /* + * This is only an error if you + * requested owner restore. If you + * didn't, we'll try to restore + * sgid/suid, but won't consider it a + * problem if we can't. + */ + archive_set_error(&a->archive, -1, + "Can't restore SGID bit"); + r = ARCHIVE_WARN; + } +#endif + } + /* While we're here, double-check the UID. */ + if (a->pst->st_uid != a->uid + && (a->todo & TODO_SUID)) { + mode &= ~ S_ISUID; +#if !defined(_WIN32) || defined(__CYGWIN__) + if (a->flags & ARCHIVE_EXTRACT_OWNER) { + archive_set_error(&a->archive, -1, + "Can't restore SUID bit"); + r = ARCHIVE_WARN; + } +#endif + } + a->todo &= ~TODO_SGID_CHECK; + a->todo &= ~TODO_SUID_CHECK; + } else if (a->todo & TODO_SUID_CHECK) { + /* + * If we don't know the UID is right, we can just check + * the user, since all systems set the file UID from + * the process UID. + */ + if (a->user_uid != a->uid) { + mode &= ~ S_ISUID; +#if !defined(_WIN32) || defined(__CYGWIN__) + if (a->flags & ARCHIVE_EXTRACT_OWNER) { + archive_set_error(&a->archive, -1, + "Can't make file SUID"); + r = ARCHIVE_WARN; + } +#endif + } + a->todo &= ~TODO_SUID_CHECK; + } + + if (S_ISLNK(a->mode)) { +#ifdef HAVE_LCHMOD + /* + * If this is a symlink, use lchmod(). If the + * platform doesn't support lchmod(), just skip it. A + * platform that doesn't provide a way to set + * permissions on symlinks probably ignores + * permissions on symlinks, so a failure here has no + * impact. + */ + if (lchmod(a->name, mode) != 0) { + archive_set_error(&a->archive, errno, + "Can't set permissions to 0%o", (int)mode); + r = ARCHIVE_WARN; + } +#endif + } else if (!S_ISDIR(a->mode)) { + /* + * If it's not a symlink and not a dir, then use + * fchmod() or chmod(), depending on whether we have + * an fd. 
Dirs get their perms set during the + * post-extract fixup, which is handled elsewhere. + */ +#ifdef HAVE_FCHMOD + if (a->fd >= 0) { + if (fchmod(a->fd, mode) != 0) { + archive_set_error(&a->archive, errno, + "Can't set permissions to 0%o", (int)mode); + r = ARCHIVE_WARN; + } + } else +#endif + /* If this platform lacks fchmod(), then + * we'll just use chmod(). */ + if (chmod(a->name, mode) != 0) { + archive_set_error(&a->archive, errno, + "Can't set permissions to 0%o", (int)mode); + r = ARCHIVE_WARN; + } + } + return (r); +} + +static int +set_fflags(struct archive_write_disk *a) +{ + struct fixup_entry *le; + unsigned long set, clear; + int r; + int critical_flags; + mode_t mode = archive_entry_mode(a->entry); + + /* + * Make 'critical_flags' hold all file flags that can't be + * immediately restored. For example, on BSD systems, + * SF_IMMUTABLE prevents hardlinks from being created, so + * should not be set until after any hardlinks are created. To + * preserve some semblance of portability, this uses #ifdef + * extensively. Ugly, but it works. + * + * Yes, Virginia, this does create a security race. It's mitigated + * somewhat by the practice of creating dirs 0700 until the extract + * is done, but it would be nice if we could do more than that. + * People restoring critical file systems should be wary of + * other programs that might try to muck with files as they're + * being restored. + */ + /* Hopefully, the compiler will optimize this mess into a constant. 
*/ + critical_flags = 0; +#ifdef SF_IMMUTABLE + critical_flags |= SF_IMMUTABLE; +#endif +#ifdef UF_IMMUTABLE + critical_flags |= UF_IMMUTABLE; +#endif +#ifdef SF_APPEND + critical_flags |= SF_APPEND; +#endif +#ifdef UF_APPEND + critical_flags |= UF_APPEND; +#endif +#ifdef EXT2_APPEND_FL + critical_flags |= EXT2_APPEND_FL; +#endif +#ifdef EXT2_IMMUTABLE_FL + critical_flags |= EXT2_IMMUTABLE_FL; +#endif + + if (a->todo & TODO_FFLAGS) { + archive_entry_fflags(a->entry, &set, &clear); + + /* + * The first test encourages the compiler to eliminate + * all of this if it's not necessary. + */ + if ((critical_flags != 0) && (set & critical_flags)) { + le = current_fixup(a, a->name); + le->fixup |= TODO_FFLAGS; + le->fflags_set = set; + /* Store the mode if it's not already there. */ + if ((le->fixup & TODO_MODE) == 0) + le->mode = mode; + } else { + r = set_fflags_platform(a, a->fd, + a->name, mode, set, clear); + if (r != ARCHIVE_OK) + return (r); + } + } + return (ARCHIVE_OK); +} + + +#if ( defined(HAVE_LCHFLAGS) || defined(HAVE_CHFLAGS) || defined(HAVE_FCHFLAGS) ) && defined(HAVE_STRUCT_STAT_ST_FLAGS) +/* + * BSD reads flags using stat() and sets them with one of {f,l,}chflags() + */ +static int +set_fflags_platform(struct archive_write_disk *a, int fd, const char *name, + mode_t mode, unsigned long set, unsigned long clear) +{ + int r; + + (void)mode; /* UNUSED */ + if (set == 0 && clear == 0) + return (ARCHIVE_OK); + + /* + * XXX Is the stat here really necessary? Or can I just use + * the 'set' flags directly? In particular, I'm not sure + * about the correct approach if we're overwriting an existing + * file that already has flags on it. XXX + */ + if ((r = _archive_write_disk_lazy_stat(a)) != ARCHIVE_OK) + return (r); + + a->st.st_flags &= ~clear; + a->st.st_flags |= set; +#ifdef HAVE_FCHFLAGS + /* If platform has fchflags() and we were given an fd, use it. 
*/ + if (fd >= 0 && fchflags(fd, a->st.st_flags) == 0) + return (ARCHIVE_OK); +#endif + /* + * If we can't use the fd to set the flags, we'll use the + * pathname to set flags. We prefer lchflags() but will use + * chflags() if we must. + */ +#ifdef HAVE_LCHFLAGS + if (lchflags(name, a->st.st_flags) == 0) + return (ARCHIVE_OK); +#elif defined(HAVE_CHFLAGS) + if (S_ISLNK(a->st.st_mode)) { + archive_set_error(&a->archive, errno, + "Can't set file flags on symlink."); + return (ARCHIVE_WARN); + } + if (chflags(name, a->st.st_flags) == 0) + return (ARCHIVE_OK); +#endif + archive_set_error(&a->archive, errno, + "Failed to set file flags"); + return (ARCHIVE_WARN); +} + +#elif defined(EXT2_IOC_GETFLAGS) && defined(EXT2_IOC_SETFLAGS) +/* + * Linux uses ioctl() to read and write file flags. + */ +static int +set_fflags_platform(struct archive_write_disk *a, int fd, const char *name, + mode_t mode, unsigned long set, unsigned long clear) +{ + int ret; + int myfd = fd; + unsigned long newflags, oldflags; + unsigned long sf_mask = 0; + + if (set == 0 && clear == 0) + return (ARCHIVE_OK); + /* Only regular files and dirs can have flags. */ + if (!S_ISREG(mode) && !S_ISDIR(mode)) + return (ARCHIVE_OK); + + /* If we weren't given an fd, open it ourselves. */ + if (myfd < 0) + myfd = open(name, O_RDONLY | O_NONBLOCK | O_BINARY); + if (myfd < 0) + return (ARCHIVE_OK); + + /* + * Linux has no define for the flags that are only settable by + * the root user. This code may seem a little complex, but + * there seem to be some Linux systems that lack these + * defines. (?) The code below degrades reasonably gracefully + * if sf_mask is incomplete. + */ +#ifdef EXT2_IMMUTABLE_FL + sf_mask |= EXT2_IMMUTABLE_FL; +#endif +#ifdef EXT2_APPEND_FL + sf_mask |= EXT2_APPEND_FL; +#endif + /* + * XXX As above, this would be way simpler if we didn't have + * to read the current flags from disk. XXX + */ + ret = ARCHIVE_OK; + /* Try setting the flags as given. 
*/ + if (ioctl(myfd, EXT2_IOC_GETFLAGS, &oldflags) >= 0) { + newflags = (oldflags & ~clear) | set; + if (ioctl(myfd, EXT2_IOC_SETFLAGS, &newflags) >= 0) + goto cleanup; + if (errno != EPERM) + goto fail; + } + /* If we couldn't set all the flags, try again with a subset. */ + if (ioctl(myfd, EXT2_IOC_GETFLAGS, &oldflags) >= 0) { + newflags &= ~sf_mask; + oldflags &= sf_mask; + newflags |= oldflags; + if (ioctl(myfd, EXT2_IOC_SETFLAGS, &newflags) >= 0) + goto cleanup; + } + /* We couldn't set the flags, so report the failure. */ +fail: + archive_set_error(&a->archive, errno, + "Failed to set file flags"); + ret = ARCHIVE_WARN; +cleanup: + if (fd < 0) + close(myfd); + return (ret); +} + +#else + +/* + * Of course, some systems have neither BSD chflags() nor Linux' flags + * support through ioctl(). + */ +static int +set_fflags_platform(struct archive_write_disk *a, int fd, const char *name, + mode_t mode, unsigned long set, unsigned long clear) +{ + (void)a; /* UNUSED */ + (void)fd; /* UNUSED */ + (void)name; /* UNUSED */ + (void)mode; /* UNUSED */ + (void)set; /* UNUSED */ + (void)clear; /* UNUSED */ + return (ARCHIVE_OK); +} + +#endif /* __linux */ + +#ifndef HAVE_POSIX_ACL +/* Default empty function body to satisfy mainline code. */ +static int +set_acls(struct archive_write_disk *a) +{ + (void)a; /* UNUSED */ + return (ARCHIVE_OK); +} + +#else + +/* + * XXX TODO: What about ACL types other than ACCESS and DEFAULT? 
+ */ +static int +set_acls(struct archive_write_disk *a) +{ + int ret; + + ret = set_acl(a, a->fd, a->entry, ACL_TYPE_ACCESS, + ARCHIVE_ENTRY_ACL_TYPE_ACCESS, "access"); + if (ret != ARCHIVE_OK) + return (ret); + ret = set_acl(a, a->fd, a->entry, ACL_TYPE_DEFAULT, + ARCHIVE_ENTRY_ACL_TYPE_DEFAULT, "default"); + return (ret); +} + + +static int +set_acl(struct archive_write_disk *a, int fd, struct archive_entry *entry, + acl_type_t acl_type, int ae_requested_type, const char *tname) +{ + acl_t acl; + acl_entry_t acl_entry; + acl_permset_t acl_permset; + int ret; + int ae_type, ae_permset, ae_tag, ae_id; + uid_t ae_uid; + gid_t ae_gid; + const char *ae_name; + int entries; + const char *name; + + ret = ARCHIVE_OK; + entries = archive_entry_acl_reset(entry, ae_requested_type); + if (entries == 0) + return (ARCHIVE_OK); + acl = acl_init(entries); + while (archive_entry_acl_next(entry, ae_requested_type, &ae_type, + &ae_permset, &ae_tag, &ae_id, &ae_name) == ARCHIVE_OK) { + acl_create_entry(&acl, &acl_entry); + + switch (ae_tag) { + case ARCHIVE_ENTRY_ACL_USER: + acl_set_tag_type(acl_entry, ACL_USER); + ae_uid = a->lookup_uid(a->lookup_uid_data, + ae_name, ae_id); + acl_set_qualifier(acl_entry, &ae_uid); + break; + case ARCHIVE_ENTRY_ACL_GROUP: + acl_set_tag_type(acl_entry, ACL_GROUP); + ae_gid = a->lookup_gid(a->lookup_gid_data, + ae_name, ae_id); + acl_set_qualifier(acl_entry, &ae_gid); + break; + case ARCHIVE_ENTRY_ACL_USER_OBJ: + acl_set_tag_type(acl_entry, ACL_USER_OBJ); + break; + case ARCHIVE_ENTRY_ACL_GROUP_OBJ: + acl_set_tag_type(acl_entry, ACL_GROUP_OBJ); + break; + case ARCHIVE_ENTRY_ACL_MASK: + acl_set_tag_type(acl_entry, ACL_MASK); + break; + case ARCHIVE_ENTRY_ACL_OTHER: + acl_set_tag_type(acl_entry, ACL_OTHER); + break; + default: + /* XXX */ + break; + } + + acl_get_permset(acl_entry, &acl_permset); + acl_clear_perms(acl_permset); + if (ae_permset & ARCHIVE_ENTRY_ACL_EXECUTE) + acl_add_perm(acl_permset, ACL_EXECUTE); + if (ae_permset & 
ARCHIVE_ENTRY_ACL_WRITE) + acl_add_perm(acl_permset, ACL_WRITE); + if (ae_permset & ARCHIVE_ENTRY_ACL_READ) + acl_add_perm(acl_permset, ACL_READ); + } + + name = archive_entry_pathname(entry); + + /* Try restoring the ACL through 'fd' if we can. */ +#if HAVE_ACL_SET_FD + if (fd >= 0 && acl_type == ACL_TYPE_ACCESS && acl_set_fd(fd, acl) == 0) + ret = ARCHIVE_OK; + else +#else +#if HAVE_ACL_SET_FD_NP + if (fd >= 0 && acl_set_fd_np(fd, acl, acl_type) == 0) + ret = ARCHIVE_OK; + else +#endif +#endif + if (acl_set_file(name, acl_type, acl) != 0) { + archive_set_error(&a->archive, errno, "Failed to set %s acl", tname); + ret = ARCHIVE_WARN; + } + acl_free(acl); + return (ret); +} +#endif + +#if HAVE_LSETXATTR +/* + * Restore extended attributes - Linux implementation + */ +static int +set_xattrs(struct archive_write_disk *a) +{ + struct archive_entry *entry = a->entry; + static int warning_done = 0; + int ret = ARCHIVE_OK; + int i = archive_entry_xattr_reset(entry); + + while (i--) { + const char *name; + const void *value; + size_t size; + archive_entry_xattr_next(entry, &name, &value, &size); + if (name != NULL && + strncmp(name, "xfsroot.", 8) != 0 && + strncmp(name, "system.", 7) != 0) { + int e; +#if HAVE_FSETXATTR + if (a->fd >= 0) + e = fsetxattr(a->fd, name, value, size, 0); + else +#endif + { + e = lsetxattr(archive_entry_pathname(entry), + name, value, size, 0); + } + if (e == -1) { + if (errno == ENOTSUP) { + if (!warning_done) { + warning_done = 1; + archive_set_error(&a->archive, errno, + "Cannot restore extended " + "attributes on this file " + "system"); + } + } else + archive_set_error(&a->archive, errno, + "Failed to set extended attribute"); + ret = ARCHIVE_WARN; + } + } else { + archive_set_error(&a->archive, ARCHIVE_ERRNO_FILE_FORMAT, + "Invalid extended attribute encountered"); + ret = ARCHIVE_WARN; + } + } + return (ret); +} +#elif HAVE_EXTATTR_SET_FILE +/* + * Restore extended attributes - FreeBSD implementation + */ +static int +set_xattrs(struct 
archive_write_disk *a) +{ + struct archive_entry *entry = a->entry; + static int warning_done = 0; + int ret = ARCHIVE_OK; + int i = archive_entry_xattr_reset(entry); + + while (i--) { + const char *name; + const void *value; + size_t size; + archive_entry_xattr_next(entry, &name, &value, &size); + if (name != NULL) { + int e; + int namespace; + + if (strncmp(name, "user.", 5) == 0) { + /* "user." attributes go to user namespace */ + name += 5; + namespace = EXTATTR_NAMESPACE_USER; + } else { + /* Warn about other extended attributes. */ + archive_set_error(&a->archive, + ARCHIVE_ERRNO_FILE_FORMAT, + "Can't restore extended attribute ``%s''", + name); + ret = ARCHIVE_WARN; + continue; + } + errno = 0; +#if HAVE_EXTATTR_SET_FD + if (a->fd >= 0) + e = extattr_set_fd(a->fd, namespace, name, value, size); + else +#endif + /* TODO: should we use extattr_set_link() instead? */ + { + e = extattr_set_file(archive_entry_pathname(entry), + namespace, name, value, size); + } + if (e != (int)size) { + if (errno == ENOTSUP) { + if (!warning_done) { + warning_done = 1; + archive_set_error(&a->archive, errno, + "Cannot restore extended " + "attributes on this file " + "system"); + } + } else { + archive_set_error(&a->archive, errno, + "Failed to set extended attribute"); + } + + ret = ARCHIVE_WARN; + } + } + } + return (ret); +} +#else +/* + * Restore extended attributes - stub implementation for unsupported systems + */ +static int +set_xattrs(struct archive_write_disk *a) +{ + static int warning_done = 0; + + /* If there aren't any extended attributes, then it's okay not + * to extract them, otherwise, issue a single warning. */ + if (archive_entry_xattr_count(a->entry) != 0 && !warning_done) { + warning_done = 1; + archive_set_error(&a->archive, ARCHIVE_ERRNO_FILE_FORMAT, + "Cannot restore extended attributes on this system"); + return (ARCHIVE_WARN); + } + /* Warning was already emitted; suppress further warnings. 
*/ + return (ARCHIVE_OK); +} +#endif + + +/* + * Trivial implementations of gid/uid lookup functions. + * These are normally overridden by the client, but these stub + * versions ensure that we always have something that works. + */ +static gid_t +trivial_lookup_gid(void *private_data, const char *gname, gid_t gid) +{ + (void)private_data; /* UNUSED */ + (void)gname; /* UNUSED */ + return (gid); +} + +static uid_t +trivial_lookup_uid(void *private_data, const char *uname, uid_t uid) +{ + (void)private_data; /* UNUSED */ + (void)uname; /* UNUSED */ + return (uid); +} + +/* + * Test if file on disk is older than entry. + */ +static int +older(struct stat *st, struct archive_entry *entry) +{ + /* First, test the seconds and return if we have a definite answer. */ + /* Definitely older. */ + if (st->st_mtime < archive_entry_mtime(entry)) + return (1); + /* Definitely younger. */ + if (st->st_mtime > archive_entry_mtime(entry)) + return (0); + /* If this platform supports fractional seconds, try those. */ +#if HAVE_STRUCT_STAT_ST_MTIMESPEC_TV_NSEC + /* Definitely older. */ + if (st->st_mtimespec.tv_nsec < archive_entry_mtime_nsec(entry)) + return (1); +#elif HAVE_STRUCT_STAT_ST_MTIM_TV_NSEC + /* Definitely older. */ + if (st->st_mtim.tv_nsec < archive_entry_mtime_nsec(entry)) + return (1); +#elif HAVE_STRUCT_STAT_ST_MTIME_N + /* older. */ + if (st->st_mtime_n < archive_entry_mtime_nsec(entry)) + return (1); +#elif HAVE_STRUCT_STAT_ST_UMTIME + /* older. */ + if (st->st_umtime * 1000 < archive_entry_mtime_nsec(entry)) + return (1); +#elif HAVE_STRUCT_STAT_ST_MTIME_USEC + /* older. */ + if (st->st_mtime_usec * 1000 < archive_entry_mtime_nsec(entry)) + return (1); +#else + /* This system doesn't have high-res timestamps. */ +#endif + /* Same age or newer, so not older. 
*/ + return (0); +} diff --git a/lib/libarchive/archive_write_disk_private.h b/lib/libarchive/archive_write_disk_private.h new file mode 100644 index 000000000..707c0cf03 --- /dev/null +++ b/lib/libarchive/archive_write_disk_private.h @@ -0,0 +1,38 @@ +/*- + * Copyright (c) 2003-2007 Tim Kientzle + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer + * in this position and unchanged. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR(S) ``AS IS'' AND ANY EXPRESS OR + * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES + * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. + * IN NO EVENT SHALL THE AUTHOR(S) BE LIABLE FOR ANY DIRECT, INDIRECT, + * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT + * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF + * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + * + * $FreeBSD: head/lib/libarchive/archive_write_disk_private.h 201086 2009-12-28 02:17:53Z kientzle $ + */ + +#ifndef __LIBARCHIVE_BUILD +#error This header is only to be used internally to libarchive. 
+#endif + +#ifndef ARCHIVE_WRITE_DISK_PRIVATE_H_INCLUDED +#define ARCHIVE_WRITE_DISK_PRIVATE_H_INCLUDED + +struct archive_write_disk; + +#endif diff --git a/lib/libarchive/archive_write_disk_set_standard_lookup.c b/lib/libarchive/archive_write_disk_set_standard_lookup.c new file mode 100644 index 000000000..8f8260f07 --- /dev/null +++ b/lib/libarchive/archive_write_disk_set_standard_lookup.c @@ -0,0 +1,252 @@ +/*- + * Copyright (c) 2003-2007 Tim Kientzle + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR(S) ``AS IS'' AND ANY EXPRESS OR + * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES + * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. + * IN NO EVENT SHALL THE AUTHOR(S) BE LIABLE FOR ANY DIRECT, INDIRECT, + * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT + * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF + * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 
+ */ + +#include "archive_platform.h" +__FBSDID("$FreeBSD: head/lib/libarchive/archive_write_disk_set_standard_lookup.c 201083 2009-12-28 02:09:57Z kientzle $"); + +#ifdef HAVE_SYS_TYPES_H +#include <sys/types.h> +#endif +#ifdef HAVE_ERRNO_H +#include <errno.h> +#endif +#ifdef HAVE_GRP_H +#include <grp.h> +#endif +#ifdef HAVE_PWD_H +#include <pwd.h> +#endif +#ifdef HAVE_STDLIB_H +#include <stdlib.h> +#endif +#ifdef HAVE_STRING_H +#include <string.h> +#endif + +#include "archive.h" +#include "archive_private.h" +#include "archive_read_private.h" +#include "archive_write_disk_private.h" + +struct bucket { + char *name; + int hash; + id_t id; +}; + +static const size_t cache_size = 127; +static unsigned int hash(const char *); +static gid_t lookup_gid(void *, const char *uname, gid_t); +static uid_t lookup_uid(void *, const char *uname, uid_t); +static void cleanup(void *); + +/* + * Installs functions that use getpwnam()/getgrnam()---along with + * a simple cache to accelerate such lookups---into the archive_write_disk + * object. This is in a separate file because getpwnam()/getgrnam() + * can pull in a LOT of library code (including NIS/LDAP functions, which + * pull in DNS resolvers, etc). This can easily top 500kB, which makes + * it inappropriate for some space-constrained applications. + * + * Applications that are size-sensitive may want to just use the + * real default functions (defined in archive_write_disk.c) that just + * use the uid/gid without the lookup. Or define your own custom functions + * if you prefer. + * + * TODO: Replace these hash tables with simpler move-to-front LRU + * lists with a bounded size (128 items?). The hash is a bit faster, + * but has a bad pathology in which it thrashes a single bucket. Even + * walking a list of 128 items is a lot faster than calling + * getpwnam()!
+ */ +int +archive_write_disk_set_standard_lookup(struct archive *a) +{ + struct bucket *ucache = malloc(cache_size * sizeof(struct bucket)); + struct bucket *gcache = malloc(cache_size * sizeof(struct bucket)); + memset(ucache, 0, cache_size * sizeof(struct bucket)); + memset(gcache, 0, cache_size * sizeof(struct bucket)); + archive_write_disk_set_group_lookup(a, gcache, lookup_gid, cleanup); + archive_write_disk_set_user_lookup(a, ucache, lookup_uid, cleanup); + return (ARCHIVE_OK); +} + +static gid_t +lookup_gid(void *private_data, const char *gname, gid_t gid) +{ + int h; + struct bucket *b; + struct bucket *gcache = (struct bucket *)private_data; + + /* If no gname, just use the gid provided. */ + if (gname == NULL || *gname == '\0') + return (gid); + + /* Try to find gname in the cache. */ + h = hash(gname); + b = &gcache[h % cache_size ]; + if (b->name != NULL && b->hash == h && strcmp(gname, b->name) == 0) + return ((gid_t)b->id); + + /* Free the cache slot for a new entry. */ + if (b->name != NULL) + free(b->name); + b->name = strdup(gname); + /* Note: If strdup fails, that's okay; we just won't cache. */ + b->hash = h; +#if HAVE_GRP_H + { + char _buffer[128]; + size_t bufsize = 128; + char *buffer = _buffer; + struct group grent, *result; + int r; + + for (;;) { + result = &grent; /* Old getgrnam_r ignores last arg. */ +#if defined(HAVE_GETGRNAM_R) + r = getgrnam_r(gname, &grent, buffer, bufsize, &result); +#else + result = getgrnam(gname); + r = errno; +#endif + if (r == 0) + break; + if (r != ERANGE) + break; + bufsize *= 2; + if (buffer != _buffer) + free(buffer); + buffer = malloc(bufsize); + if (buffer == NULL) + break; + } + if (result != NULL) + gid = result->gr_gid; + if (buffer != _buffer) + free(buffer); + } +#elif defined(_WIN32) && !defined(__CYGWIN__) + /* TODO: do a gname->gid lookup for Windows. 
*/ +#else + #error No way to perform gid lookups on this platform +#endif + b->id = gid; + + return (gid); +} + +static uid_t +lookup_uid(void *private_data, const char *uname, uid_t uid) +{ + int h; + struct bucket *b; + struct bucket *ucache = (struct bucket *)private_data; + + /* If no uname, just use the uid provided. */ + if (uname == NULL || *uname == '\0') + return (uid); + + /* Try to find uname in the cache. */ + h = hash(uname); + b = &ucache[h % cache_size ]; + if (b->name != NULL && b->hash == h && strcmp(uname, b->name) == 0) + return ((uid_t)b->id); + + /* Free the cache slot for a new entry. */ + if (b->name != NULL) + free(b->name); + b->name = strdup(uname); + /* Note: If strdup fails, that's okay; we just won't cache. */ + b->hash = h; +#if HAVE_PWD_H + { + char _buffer[128]; + size_t bufsize = 128; + char *buffer = _buffer; + struct passwd pwent, *result; + int r; + + for (;;) { + result = &pwent; /* Old getpwnam_r ignores last arg. */ +#if defined(HAVE_GETPWNAM_R) + r = getpwnam_r(uname, &pwent, buffer, bufsize, &result); +#else + result = getpwnam(uname); + r = errno; +#endif + if (r == 0) + break; + if (r != ERANGE) + break; + bufsize *= 2; + if (buffer != _buffer) + free(buffer); + buffer = malloc(bufsize); + if (buffer == NULL) + break; + } + if (result != NULL) + uid = result->pw_uid; + if (buffer != _buffer) + free(buffer); + } +#elif defined(_WIN32) && !defined(__CYGWIN__) + /* TODO: do a uname->uid lookup for Windows. */ +#else + #error No way to look up uids on this platform +#endif + b->id = uid; + + return (uid); +} + +static void +cleanup(void *private) +{ + size_t i; + struct bucket *cache = (struct bucket *)private; + + for (i = 0; i < cache_size; i++) + free(cache[i].name); + free(cache); +} + + +static unsigned int +hash(const char *p) +{ + /* A 32-bit version of Peter Weinberger's (PJW) hash algorithm, + as used by ELF for hashing function names. 
*/ + unsigned g, h = 0; + while (*p != '\0') { + h = (h << 4) + *p++; + if ((g = h & 0xF0000000) != 0) { + h ^= g >> 24; + h &= 0x0FFFFFFF; + } + } + return h; +} diff --git a/lib/libarchive/archive_write_open_fd.c b/lib/libarchive/archive_write_open_fd.c new file mode 100644 index 000000000..3a6039871 --- /dev/null +++ b/lib/libarchive/archive_write_open_fd.c @@ -0,0 +1,141 @@ +/*- + * Copyright (c) 2003-2007 Tim Kientzle + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR(S) ``AS IS'' AND ANY EXPRESS OR + * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES + * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. + * IN NO EVENT SHALL THE AUTHOR(S) BE LIABLE FOR ANY DIRECT, INDIRECT, + * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT + * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF + * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 
+ */ + +#include "archive_platform.h" +__FBSDID("$FreeBSD: head/lib/libarchive/archive_write_open_fd.c 201093 2009-12-28 02:28:44Z kientzle $"); + +#ifdef HAVE_SYS_STAT_H +#include <sys/stat.h> +#endif +#ifdef HAVE_ERRNO_H +#include <errno.h> +#endif +#ifdef HAVE_FCNTL_H +#include <fcntl.h> +#endif +#ifdef HAVE_IO_H +#include <io.h> +#endif +#ifdef HAVE_STDLIB_H +#include <stdlib.h> +#endif +#ifdef HAVE_STRING_H +#include <string.h> +#endif +#ifdef HAVE_UNISTD_H +#include <unistd.h> +#endif + +#include "archive.h" + +struct write_fd_data { + off_t offset; + int fd; +}; + +static int file_close(struct archive *, void *); +static int file_open(struct archive *, void *); +static ssize_t file_write(struct archive *, void *, const void *buff, size_t); + +int +archive_write_open_fd(struct archive *a, int fd) +{ + struct write_fd_data *mine; + + mine = (struct write_fd_data *)malloc(sizeof(*mine)); + if (mine == NULL) { + archive_set_error(a, ENOMEM, "No memory"); + return (ARCHIVE_FATAL); + } + mine->fd = fd; +#if defined(__CYGWIN__) || defined(_WIN32) + setmode(mine->fd, O_BINARY); +#endif + return (archive_write_open(a, mine, + file_open, file_write, file_close)); +} + +static int +file_open(struct archive *a, void *client_data) +{ + struct write_fd_data *mine; + struct stat st; + + mine = (struct write_fd_data *)client_data; + + if (fstat(mine->fd, &st) != 0) { + archive_set_error(a, errno, "Couldn't stat fd %d", mine->fd); + return (ARCHIVE_FATAL); + } + + /* + * If this is a regular file, don't add it to itself. + */ + if (S_ISREG(st.st_mode)) + archive_write_set_skip_file(a, st.st_dev, st.st_ino); + + /* + * If client hasn't explicitly set the last block handling, + * then set it here. + */ + if (archive_write_get_bytes_in_last_block(a) < 0) { + /* If the output is a block or character device, fifo, + * or stdout, pad the last block, otherwise leave it + * unpadded. */ + if (S_ISCHR(st.st_mode) || S_ISBLK(st.st_mode) || + S_ISFIFO(st.st_mode) || (mine->fd == 1)) + /* Last block will be fully padded.
*/ + archive_write_set_bytes_in_last_block(a, 0); + else + archive_write_set_bytes_in_last_block(a, 1); + } + + return (ARCHIVE_OK); +} + +static ssize_t +file_write(struct archive *a, void *client_data, const void *buff, size_t length) +{ + struct write_fd_data *mine; + ssize_t bytesWritten; + + mine = (struct write_fd_data *)client_data; + bytesWritten = write(mine->fd, buff, length); + if (bytesWritten <= 0) { + archive_set_error(a, errno, "Write error"); + return (-1); + } + return (bytesWritten); +} + +static int +file_close(struct archive *a, void *client_data) +{ + struct write_fd_data *mine = (struct write_fd_data *)client_data; + + (void)a; /* UNUSED */ + free(mine); + return (ARCHIVE_OK); +} diff --git a/lib/libarchive/archive_write_open_file.c b/lib/libarchive/archive_write_open_file.c new file mode 100644 index 000000000..5c0c737f8 --- /dev/null +++ b/lib/libarchive/archive_write_open_file.c @@ -0,0 +1,105 @@ +/*- + * Copyright (c) 2003-2007 Tim Kientzle + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR(S) ``AS IS'' AND ANY EXPRESS OR + * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES + * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. 
+ * IN NO EVENT SHALL THE AUTHOR(S) BE LIABLE FOR ANY DIRECT, INDIRECT, + * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT + * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF + * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + */ + +#include "archive_platform.h" +__FBSDID("$FreeBSD: src/lib/libarchive/archive_write_open_file.c,v 1.19 2007/01/09 08:05:56 kientzle Exp $"); + +#ifdef HAVE_SYS_STAT_H +#include <sys/stat.h> +#endif +#ifdef HAVE_ERRNO_H +#include <errno.h> +#endif +#ifdef HAVE_FCNTL_H +#include <fcntl.h> +#endif +#ifdef HAVE_STDLIB_H +#include <stdlib.h> +#endif +#ifdef HAVE_STRING_H +#include <string.h> +#endif +#ifdef HAVE_UNISTD_H +#include <unistd.h> +#endif + +#include "archive.h" + +struct write_FILE_data { + FILE *f; +}; + +static int file_close(struct archive *, void *); +static int file_open(struct archive *, void *); +static ssize_t file_write(struct archive *, void *, const void *buff, size_t); + +int +archive_write_open_FILE(struct archive *a, FILE *f) +{ + struct write_FILE_data *mine; + + mine = (struct write_FILE_data *)malloc(sizeof(*mine)); + if (mine == NULL) { + archive_set_error(a, ENOMEM, "No memory"); + return (ARCHIVE_FATAL); + } + mine->f = f; + return (archive_write_open(a, mine, + file_open, file_write, file_close)); +} + +static int +file_open(struct archive *a, void *client_data) +{ + (void)a; /* UNUSED */ + (void)client_data; /* UNUSED */ + + return (ARCHIVE_OK); +} + +static ssize_t +file_write(struct archive *a, void *client_data, const void *buff, size_t length) +{ + struct write_FILE_data *mine; + size_t bytesWritten; + + mine = client_data; + bytesWritten = fwrite(buff, 1, length, mine->f); + if (bytesWritten < length) { + archive_set_error(a, errno, "Write error"); + return (-1); + } + return
(bytesWritten); +} + +static int +file_close(struct archive *a, void *client_data) +{ + struct write_FILE_data *mine = client_data; + + (void)a; /* UNUSED */ + free(mine); + return (ARCHIVE_OK); +} diff --git a/lib/libarchive/archive_write_open_filename.c b/lib/libarchive/archive_write_open_filename.c new file mode 100644 index 000000000..6a9c77816 --- /dev/null +++ b/lib/libarchive/archive_write_open_filename.c @@ -0,0 +1,162 @@ +/*- + * Copyright (c) 2003-2007 Tim Kientzle + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR(S) ``AS IS'' AND ANY EXPRESS OR + * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES + * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. + * IN NO EVENT SHALL THE AUTHOR(S) BE LIABLE FOR ANY DIRECT, INDIRECT, + * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT + * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF + * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 
+ */ + +#include "archive_platform.h" +__FBSDID("$FreeBSD: head/lib/libarchive/archive_write_open_filename.c 191165 2009-04-17 00:39:35Z kientzle $"); + +#ifdef HAVE_SYS_STAT_H +#include <sys/stat.h> +#endif +#ifdef HAVE_ERRNO_H +#include <errno.h> +#endif +#ifdef HAVE_FCNTL_H +#include <fcntl.h> +#endif +#ifdef HAVE_STDLIB_H +#include <stdlib.h> +#endif +#ifdef HAVE_STRING_H +#include <string.h> +#endif +#ifdef HAVE_UNISTD_H +#include <unistd.h> +#endif + +#include "archive.h" + +#ifndef O_BINARY +#define O_BINARY 0 +#endif + +struct write_file_data { + int fd; + char filename[1]; +}; + +static int file_close(struct archive *, void *); +static int file_open(struct archive *, void *); +static ssize_t file_write(struct archive *, void *, const void *buff, size_t); + +int +archive_write_open_file(struct archive *a, const char *filename) +{ + return (archive_write_open_filename(a, filename)); +} + +int +archive_write_open_filename(struct archive *a, const char *filename) +{ + struct write_file_data *mine; + + if (filename == NULL || filename[0] == '\0') + return (archive_write_open_fd(a, 1)); + + mine = (struct write_file_data *)malloc(sizeof(*mine) + strlen(filename)); + if (mine == NULL) { + archive_set_error(a, ENOMEM, "No memory"); + return (ARCHIVE_FATAL); + } + strcpy(mine->filename, filename); + mine->fd = -1; + return (archive_write_open(a, mine, + file_open, file_write, file_close)); +} + +static int +file_open(struct archive *a, void *client_data) +{ + int flags; + struct write_file_data *mine; + struct stat st; + + mine = (struct write_file_data *)client_data; + flags = O_WRONLY | O_CREAT | O_TRUNC | O_BINARY; + + /* + * Open the file. + */ + mine->fd = open(mine->filename, flags, 0666); + if (mine->fd < 0) { + archive_set_error(a, errno, "Failed to open '%s'", + mine->filename); + return (ARCHIVE_FATAL); + } + + if (fstat(mine->fd, &st) != 0) { + archive_set_error(a, errno, "Couldn't stat '%s'", + mine->filename); + return (ARCHIVE_FATAL); + } + + /* + * Set up default last block handling.
+ */ + if (archive_write_get_bytes_in_last_block(a) < 0) { + if (S_ISCHR(st.st_mode) || S_ISBLK(st.st_mode) || + S_ISFIFO(st.st_mode)) + /* Pad last block when writing to device or FIFO. */ + archive_write_set_bytes_in_last_block(a, 0); + else + /* Don't pad last block otherwise. */ + archive_write_set_bytes_in_last_block(a, 1); + } + + /* + * If the output file is a regular file, don't add it to + * itself. If it's a device file, it's okay to add the device + * entry to the output archive. + */ + if (S_ISREG(st.st_mode)) + archive_write_set_skip_file(a, st.st_dev, st.st_ino); + + return (ARCHIVE_OK); +} + +static ssize_t +file_write(struct archive *a, void *client_data, const void *buff, size_t length) +{ + struct write_file_data *mine; + ssize_t bytesWritten; + + mine = (struct write_file_data *)client_data; + bytesWritten = write(mine->fd, buff, length); + if (bytesWritten <= 0) { + archive_set_error(a, errno, "Write error"); + return (-1); + } + return (bytesWritten); +} + +static int +file_close(struct archive *a, void *client_data) +{ + struct write_file_data *mine = (struct write_file_data *)client_data; + + (void)a; /* UNUSED */ + close(mine->fd); + free(mine); + return (ARCHIVE_OK); +} diff --git a/lib/libarchive/archive_write_open_memory.c b/lib/libarchive/archive_write_open_memory.c new file mode 100644 index 000000000..d235ca01d --- /dev/null +++ b/lib/libarchive/archive_write_open_memory.c @@ -0,0 +1,126 @@ +/*- + * Copyright (c) 2003-2007 Tim Kientzle + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. 
Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR(S) ``AS IS'' AND ANY EXPRESS OR + * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES + * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. + * IN NO EVENT SHALL THE AUTHOR(S) BE LIABLE FOR ANY DIRECT, INDIRECT, + * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT + * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF + * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + */ + +#include "archive_platform.h" +__FBSDID("$FreeBSD: src/lib/libarchive/archive_write_open_memory.c,v 1.3 2007/01/09 08:05:56 kientzle Exp $"); + +#include <errno.h> +#include <stdlib.h> +#include <string.h> + +#include "archive.h" + +/* + * This is a little tricky. I used to allow the + * compression handling layer to fork the compressor, + * which means this write function gets invoked in + * a separate process. That would, of course, make it impossible + * to actually use the data stored into memory here. + * Fortunately, none of the compressors fork today and + * I'm reluctant to use that route in the future but, if + * forking compressors ever do reappear, this will have + * to get a lot more complicated.
+ */ + +struct write_memory_data { + size_t used; + size_t size; + size_t * client_size; + unsigned char * buff; +}; + +static int memory_write_close(struct archive *, void *); +static int memory_write_open(struct archive *, void *); +static ssize_t memory_write(struct archive *, void *, const void *buff, size_t); + +/* + * Client provides a pointer to a block of memory to receive + * the data. The 'size' param both tells us the size of the + * client buffer and lets us tell the client the final size. + */ +int +archive_write_open_memory(struct archive *a, void *buff, size_t buffSize, size_t *used) +{ + struct write_memory_data *mine; + + mine = (struct write_memory_data *)malloc(sizeof(*mine)); + if (mine == NULL) { + archive_set_error(a, ENOMEM, "No memory"); + return (ARCHIVE_FATAL); + } + memset(mine, 0, sizeof(*mine)); + mine->buff = buff; + mine->size = buffSize; + mine->client_size = used; + return (archive_write_open(a, mine, + memory_write_open, memory_write, memory_write_close)); +} + +static int +memory_write_open(struct archive *a, void *client_data) +{ + struct write_memory_data *mine; + mine = client_data; + mine->used = 0; + if (mine->client_size != NULL) + *mine->client_size = mine->used; + /* Disable padding if it hasn't been set explicitly. */ + if (-1 == archive_write_get_bytes_in_last_block(a)) + archive_write_set_bytes_in_last_block(a, 1); + return (ARCHIVE_OK); +} + +/* + * Copy the data into the client buffer. + * Note that we update mine->client_size on every write. + * In particular, this means the client can follow exactly + * how much has been written into their buffer at any time. 
+ */ +static ssize_t +memory_write(struct archive *a, void *client_data, const void *buff, size_t length) +{ + struct write_memory_data *mine; + mine = client_data; + + if (mine->used + length > mine->size) { + archive_set_error(a, ENOMEM, "Buffer exhausted"); + return (ARCHIVE_FATAL); + } + memcpy(mine->buff + mine->used, buff, length); + mine->used += length; + if (mine->client_size != NULL) + *mine->client_size = mine->used; + return (length); +} + +static int +memory_write_close(struct archive *a, void *client_data) +{ + struct write_memory_data *mine; + (void)a; /* UNUSED */ + mine = client_data; + free(mine); + return (ARCHIVE_OK); +} diff --git a/lib/libarchive/archive_write_private.h b/lib/libarchive/archive_write_private.h new file mode 100644 index 000000000..cc7ad6ad1 --- /dev/null +++ b/lib/libarchive/archive_write_private.h @@ -0,0 +1,125 @@ +/*- + * Copyright (c) 2003-2007 Tim Kientzle + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR(S) ``AS IS'' AND ANY EXPRESS OR + * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES + * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. 
+ * IN NO EVENT SHALL THE AUTHOR(S) BE LIABLE FOR ANY DIRECT, INDIRECT, + * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT + * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF + * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + * + * $FreeBSD: head/lib/libarchive/archive_write_private.h 201155 2009-12-29 05:20:12Z kientzle $ + */ + +#ifndef __LIBARCHIVE_BUILD +#error This header is only to be used internally to libarchive. +#endif + +#ifndef ARCHIVE_WRITE_PRIVATE_H_INCLUDED +#define ARCHIVE_WRITE_PRIVATE_H_INCLUDED + +#include "archive.h" +#include "archive_string.h" +#include "archive_private.h" + +struct archive_write { + struct archive archive; + + /* Dev/ino of the archive being written. */ + dev_t skip_file_dev; +#ifndef __minix + int64_t skip_file_ino; +#else + ino_t skip_file_ino; +#endif + /* Utility: Pointer to a block of nulls. */ + const unsigned char *nulls; + size_t null_length; + + /* Callbacks to open/read/write/close archive stream. */ + archive_open_callback *client_opener; + archive_write_callback *client_writer; + archive_close_callback *client_closer; + void *client_data; + + /* + * Blocking information. Note that bytes_in_last_block is + * misleadingly named; I should find a better name. These + * control the final output from all compressors, including + * compression_none. + */ + int bytes_per_block; + int bytes_in_last_block; + + /* + * These control whether data within a gzip/bzip2 compressed + * stream gets padded or not. If pad_uncompressed is set, + * the data will be padded to a full block before being + * compressed. The pad_uncompressed_byte determines the value + * that will be used for padding. 
Note that these have no + * effect on compression "none." + */ + int pad_uncompressed; + int pad_uncompressed_byte; /* TODO: Support this. */ + + /* + * On write, the client just invokes an archive_write_set function + * which sets up the data here directly. + */ + struct { + void *data; + void *config; + int (*init)(struct archive_write *); + int (*options)(struct archive_write *, + const char *key, const char *value); + int (*finish)(struct archive_write *); + int (*write)(struct archive_write *, const void *, size_t); + } compressor; + + /* + * Pointers to format-specific functions for writing. They're + * initialized by archive_write_set_format_XXX() calls. + */ + void *format_data; + const char *format_name; + int (*format_init)(struct archive_write *); + int (*format_options)(struct archive_write *, + const char *key, const char *value); + int (*format_finish)(struct archive_write *); + int (*format_destroy)(struct archive_write *); + int (*format_finish_entry)(struct archive_write *); + int (*format_write_header)(struct archive_write *, + struct archive_entry *); + ssize_t (*format_write_data)(struct archive_write *, + const void *buff, size_t); +}; + +/* + * Utility function to format a USTAR header into a buffer. If + * "strict" is set, this tries to create the absolutely most portable + * version of a ustar header. If "strict" is set to 0, then it will + * relax certain requirements. + * + * Generally, format-specific declarations don't belong in this + * header; this is a rare example of a function that is shared by + * two very similar formats (ustar and pax). 
+ */ +int +__archive_write_format_header_ustar(struct archive_write *, char buff[512], + struct archive_entry *, int tartype, int strict); + +#endif diff --git a/lib/libarchive/archive_write_set_compression_bzip2.c b/lib/libarchive/archive_write_set_compression_bzip2.c new file mode 100644 index 000000000..3aa2a8365 --- /dev/null +++ b/lib/libarchive/archive_write_set_compression_bzip2.c @@ -0,0 +1,412 @@ +/*- + * Copyright (c) 2003-2007 Tim Kientzle + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR(S) ``AS IS'' AND ANY EXPRESS OR + * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES + * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. + * IN NO EVENT SHALL THE AUTHOR(S) BE LIABLE FOR ANY DIRECT, INDIRECT, + * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT + * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF + * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 
+ */ + +#include "archive_platform.h" + +__FBSDID("$FreeBSD: head/lib/libarchive/archive_write_set_compression_bzip2.c 201091 2009-12-28 02:22:41Z kientzle $"); + +#ifdef HAVE_ERRNO_H +#include <errno.h> +#endif +#include +#ifdef HAVE_STDLIB_H +#include <stdlib.h> +#endif +#ifdef HAVE_STRING_H +#include <string.h> +#endif +#ifdef HAVE_BZLIB_H +#include <bzlib.h> +#endif + +#include "archive.h" +#include "archive_private.h" +#include "archive_write_private.h" + +#ifndef HAVE_BZLIB_H +int +archive_write_set_compression_bzip2(struct archive *a) +{ + archive_set_error(a, ARCHIVE_ERRNO_MISC, + "bzip2 compression not supported on this platform"); + return (ARCHIVE_FATAL); +} +#else +/* Don't compile this if we don't have bzlib. */ + +struct private_data { + bz_stream stream; +#ifndef __minix + int64_t total_in; +#else + ssize_t total_in; +#endif + char *compressed; + size_t compressed_buffer_size; +}; + +struct private_config { + int compression_level; +}; + +/* + * Yuck. bzlib.h is not const-correct, so I need this one bit + * of ugly hackery to convert a const * pointer to a non-const pointer. + */ +#define SET_NEXT_IN(st,src) \ + (st)->stream.next_in = (char *)(uintptr_t)(const void *)(src) + +static int archive_compressor_bzip2_finish(struct archive_write *); +static int archive_compressor_bzip2_init(struct archive_write *); +static int archive_compressor_bzip2_options(struct archive_write *, + const char *, const char *); +static int archive_compressor_bzip2_write(struct archive_write *, + const void *, size_t); +static int drive_compressor(struct archive_write *, struct private_data *, + int finishing); + +/* + * Allocate, initialize and return an archive object.
+ */ +int +archive_write_set_compression_bzip2(struct archive *_a) +{ + struct archive_write *a = (struct archive_write *)_a; + struct private_config *config; + __archive_check_magic(&a->archive, ARCHIVE_WRITE_MAGIC, + ARCHIVE_STATE_NEW, "archive_write_set_compression_bzip2"); + config = malloc(sizeof(*config)); + if (config == NULL) { + archive_set_error(&a->archive, ENOMEM, "Out of memory"); + return (ARCHIVE_FATAL); + } + a->compressor.config = config; + a->compressor.finish = archive_compressor_bzip2_finish; + config->compression_level = 9; /* default */ + a->compressor.init = &archive_compressor_bzip2_init; + a->compressor.options = &archive_compressor_bzip2_options; + a->archive.compression_code = ARCHIVE_COMPRESSION_BZIP2; + a->archive.compression_name = "bzip2"; + return (ARCHIVE_OK); +} + +/* + * Setup callback. + */ +static int +archive_compressor_bzip2_init(struct archive_write *a) +{ + int ret; + struct private_data *state; + struct private_config *config; + + config = (struct private_config *)a->compressor.config; + if (a->client_opener != NULL) { + ret = (a->client_opener)(&a->archive, a->client_data); + if (ret != 0) + return (ret); + } + + state = (struct private_data *)malloc(sizeof(*state)); + if (state == NULL) { + archive_set_error(&a->archive, ENOMEM, + "Can't allocate data for compression"); + return (ARCHIVE_FATAL); + } + memset(state, 0, sizeof(*state)); + + state->compressed_buffer_size = a->bytes_per_block; + state->compressed = (char *)malloc(state->compressed_buffer_size); + + if (state->compressed == NULL) { + archive_set_error(&a->archive, ENOMEM, + "Can't allocate data for compression buffer"); + free(state); + return (ARCHIVE_FATAL); + } + + state->stream.next_out = state->compressed; + state->stream.avail_out = state->compressed_buffer_size; + a->compressor.write = archive_compressor_bzip2_write; + + /* Initialize compression library */ + ret = BZ2_bzCompressInit(&(state->stream), + config->compression_level, 0, 30); + if (ret == 
BZ_OK) { + a->compressor.data = state; + return (ARCHIVE_OK); + } + + /* Library setup failed: clean up. */ + archive_set_error(&a->archive, ARCHIVE_ERRNO_MISC, + "Internal error initializing compression library"); + free(state->compressed); + free(state); + + /* Override the error message if we know what really went wrong. */ + switch (ret) { + case BZ_PARAM_ERROR: + archive_set_error(&a->archive, ARCHIVE_ERRNO_MISC, + "Internal error initializing compression library: " + "invalid setup parameter"); + break; + case BZ_MEM_ERROR: + archive_set_error(&a->archive, ENOMEM, + "Internal error initializing compression library: " + "out of memory"); + break; + case BZ_CONFIG_ERROR: + archive_set_error(&a->archive, ARCHIVE_ERRNO_MISC, + "Internal error initializing compression library: " + "mis-compiled library"); + break; + } + + return (ARCHIVE_FATAL); + +} + +/* + * Set write options. + */ +static int +archive_compressor_bzip2_options(struct archive_write *a, const char *key, + const char *value) +{ + struct private_config *config; + + config = (struct private_config *)a->compressor.config; + if (strcmp(key, "compression-level") == 0) { + if (value == NULL || !(value[0] >= '0' && value[0] <= '9') || + value[1] != '\0') + return (ARCHIVE_WARN); + config->compression_level = value[0] - '0'; + /* Make '0' be a synonym for '1'. */ + /* This way, bzip2 compressor supports the same 0..9 + * range of levels as gzip. */ + if (config->compression_level < 1) + config->compression_level = 1; + return (ARCHIVE_OK); + } + + return (ARCHIVE_WARN); +} + +/* + * Write data to the compressed stream. + * + * Returns ARCHIVE_OK if all data written, error otherwise. + */ +static int +archive_compressor_bzip2_write(struct archive_write *a, const void *buff, + size_t length) +{ + struct private_data *state; + + state = (struct private_data *)a->compressor.data; + if (a->client_writer == NULL) { + archive_set_error(&a->archive, ARCHIVE_ERRNO_PROGRAMMER, + "No write callback is registered? 
" + "This is probably an internal programming error."); + return (ARCHIVE_FATAL); + } + + /* Update statistics */ + state->total_in += length; + + /* Compress input data to output buffer */ + SET_NEXT_IN(state, buff); + state->stream.avail_in = length; + if (drive_compressor(a, state, 0)) + return (ARCHIVE_FATAL); + a->archive.file_position += length; + return (ARCHIVE_OK); +} + + +/* + * Finish the compression. + */ +static int +archive_compressor_bzip2_finish(struct archive_write *a) +{ + ssize_t block_length; + int ret; + struct private_data *state; + ssize_t target_block_length; + ssize_t bytes_written; + unsigned tocopy; + + ret = ARCHIVE_OK; + state = (struct private_data *)a->compressor.data; + if (state != NULL) { + if (a->client_writer == NULL) { + archive_set_error(&a->archive, + ARCHIVE_ERRNO_PROGRAMMER, + "No write callback is registered?\n" + "This is probably an internal programming error."); + ret = ARCHIVE_FATAL; + goto cleanup; + } + + /* By default, always pad the uncompressed data. */ + if (a->pad_uncompressed) { + tocopy = a->bytes_per_block - + (state->total_in % a->bytes_per_block); + while (tocopy > 0 && tocopy < (unsigned)a->bytes_per_block) { + SET_NEXT_IN(state, a->nulls); + state->stream.avail_in = tocopy < a->null_length ? + tocopy : a->null_length; + state->total_in += state->stream.avail_in; + tocopy -= state->stream.avail_in; + ret = drive_compressor(a, state, 0); + if (ret != ARCHIVE_OK) + goto cleanup; + } + } + + /* Finish compression cycle. */ + if ((ret = drive_compressor(a, state, 1))) + goto cleanup; + + /* Optionally, pad the final compressed block. */ + block_length = state->stream.next_out - state->compressed; + + /* Tricky calculation to determine size of last block. */ + if (a->bytes_in_last_block <= 0) + /* Default or Zero: pad to full block */ + target_block_length = a->bytes_per_block; + else + /* Round length to next multiple of bytes_in_last_block. 
*/ + target_block_length = a->bytes_in_last_block * + ( (block_length + a->bytes_in_last_block - 1) / + a->bytes_in_last_block); + if (target_block_length > a->bytes_per_block) + target_block_length = a->bytes_per_block; + if (block_length < target_block_length) { + memset(state->stream.next_out, 0, + target_block_length - block_length); + block_length = target_block_length; + } + + /* Write the last block */ + bytes_written = (a->client_writer)(&a->archive, a->client_data, + state->compressed, block_length); + + /* TODO: Handle short write of final block. */ + if (bytes_written <= 0) + ret = ARCHIVE_FATAL; + else { + a->archive.raw_position += bytes_written; + ret = ARCHIVE_OK; + } + + /* Cleanup: shut down compressor, release memory, etc. */ +cleanup: + switch (BZ2_bzCompressEnd(&(state->stream))) { + case BZ_OK: + break; + default: + archive_set_error(&a->archive, ARCHIVE_ERRNO_PROGRAMMER, + "Failed to clean up compressor"); + ret = ARCHIVE_FATAL; + } + + free(state->compressed); + free(state); + } + /* Free configuration data even if we were never fully initialized. */ + free(a->compressor.config); + a->compressor.config = NULL; + return (ret); +} + +/* + * Utility function to push input data through compressor, writing + * full output blocks as necessary. + * + * Note that this handles both the regular write case (finishing == + * false) and the end-of-archive case (finishing == true).
+ */ +static int +drive_compressor(struct archive_write *a, struct private_data *state, int finishing) +{ + ssize_t bytes_written; + int ret; + + for (;;) { + if (state->stream.avail_out == 0) { + bytes_written = (a->client_writer)(&a->archive, + a->client_data, state->compressed, + state->compressed_buffer_size); + if (bytes_written <= 0) { + /* TODO: Handle this write failure */ + return (ARCHIVE_FATAL); + } else if ((size_t)bytes_written < state->compressed_buffer_size) { + /* Short write: Move remainder to + * front and keep filling */ + memmove(state->compressed, + state->compressed + bytes_written, + state->compressed_buffer_size - bytes_written); + } + + a->archive.raw_position += bytes_written; + state->stream.next_out = state->compressed + + state->compressed_buffer_size - bytes_written; + state->stream.avail_out = bytes_written; + } + + /* If there's nothing to do, we're done. */ + if (!finishing && state->stream.avail_in == 0) + return (ARCHIVE_OK); + + ret = BZ2_bzCompress(&(state->stream), + finishing ? BZ_FINISH : BZ_RUN); + + switch (ret) { + case BZ_RUN_OK: + /* In non-finishing case, did compressor + * consume everything? 
*/ + if (!finishing && state->stream.avail_in == 0) + return (ARCHIVE_OK); + break; + case BZ_FINISH_OK: /* Finishing: There's more work to do */ + break; + case BZ_STREAM_END: /* Finishing: all done */ + /* Only occurs in finishing case */ + return (ARCHIVE_OK); + default: + /* Any other return value indicates an error */ + archive_set_error(&a->archive, + ARCHIVE_ERRNO_PROGRAMMER, + "Bzip2 compression failed;" + " BZ2_bzCompress() returned %d", + ret); + return (ARCHIVE_FATAL); + } + } +} + +#endif /* HAVE_BZLIB_H */ diff --git a/lib/libarchive/archive_write_set_compression_compress.c b/lib/libarchive/archive_write_set_compression_compress.c new file mode 100644 index 000000000..77c73ee5f --- /dev/null +++ b/lib/libarchive/archive_write_set_compression_compress.c @@ -0,0 +1,492 @@ +/*- + * Copyright (c) 2008 Joerg Sonnenberger + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR(S) ``AS IS'' AND ANY EXPRESS OR + * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES + * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. 
+ * IN NO EVENT SHALL THE AUTHOR(S) BE LIABLE FOR ANY DIRECT, INDIRECT, + * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT + * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF + * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + */ + +/*- + * Copyright (c) 1985, 1986, 1992, 1993 + * The Regents of the University of California. All rights reserved. + * + * This code is derived from software contributed to Berkeley by + * Diomidis Spinellis and James A. Woods, derived from original + * work by Spencer Thomas and Joseph Orost. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * 3. Neither the name of the University nor the names of its contributors + * may be used to endorse or promote products derived from this software + * without specific prior written permission. + * + * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. 
IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. + */ + +#include "archive_platform.h" + +__FBSDID("$FreeBSD: head/lib/libarchive/archive_write_set_compression_compress.c 201111 2009-12-28 03:33:05Z kientzle $"); + +#ifdef HAVE_ERRNO_H +#include <errno.h> +#endif +#ifdef HAVE_STDLIB_H +#include <stdlib.h> +#endif +#ifdef HAVE_STRING_H +#include <string.h> +#endif + +#include "archive.h" +#include "archive_private.h" +#include "archive_write_private.h" + +#define HSIZE 69001 /* 95% occupancy */ +#define HSHIFT 8 /* 8 - trunc(log2(HSIZE / 65536)) */ +#define CHECK_GAP 10000 /* Ratio check interval. */ + +#define MAXCODE(bits) ((1 << (bits)) - 1) + +/* + * the next two codes should not be changed lightly, as they must not + * lie within the contiguous general code space. + */ +#define FIRST 257 /* First free entry. */ +#define CLEAR 256 /* Table clear output code. */ + +struct private_data { + off_t in_count, out_count, checkpoint; + + int code_len; /* Number of bits/code. */ + int cur_maxcode; /* Maximum code, given n_bits. */ + int max_maxcode; /* Should NEVER generate this code. */ + int hashtab [HSIZE]; + unsigned short codetab [HSIZE]; + int first_free; /* First unused entry.
*/ + int compress_ratio; + + int cur_code, cur_fcode; + + int bit_offset; + unsigned char bit_buf; + + unsigned char *compressed; + size_t compressed_buffer_size; + size_t compressed_offset; +}; + +static int archive_compressor_compress_finish(struct archive_write *); +static int archive_compressor_compress_init(struct archive_write *); +static int archive_compressor_compress_write(struct archive_write *, + const void *, size_t); + +/* + * Allocate, initialize and return an archive object. + */ +int +archive_write_set_compression_compress(struct archive *_a) +{ + struct archive_write *a = (struct archive_write *)_a; + __archive_check_magic(&a->archive, ARCHIVE_WRITE_MAGIC, + ARCHIVE_STATE_NEW, "archive_write_set_compression_compress"); + a->compressor.init = &archive_compressor_compress_init; + a->archive.compression_code = ARCHIVE_COMPRESSION_COMPRESS; + a->archive.compression_name = "compress"; + return (ARCHIVE_OK); +} + +/* + * Setup callback. + */ +static int +archive_compressor_compress_init(struct archive_write *a) +{ + int ret; + struct private_data *state; + + a->archive.compression_code = ARCHIVE_COMPRESSION_COMPRESS; + a->archive.compression_name = "compress"; + + if (a->bytes_per_block < 4) { + archive_set_error(&a->archive, EINVAL, + "Can't write Compress header as single block"); + return (ARCHIVE_FATAL); + } + + if (a->client_opener != NULL) { + ret = (a->client_opener)(&a->archive, a->client_data); + if (ret != ARCHIVE_OK) + return (ret); + } + + state = (struct private_data *)malloc(sizeof(*state)); + if (state == NULL) { + archive_set_error(&a->archive, ENOMEM, + "Can't allocate data for compression"); + return (ARCHIVE_FATAL); + } + memset(state, 0, sizeof(*state)); + + state->compressed_buffer_size = a->bytes_per_block; + state->compressed = malloc(state->compressed_buffer_size); + + if (state->compressed == NULL) { + archive_set_error(&a->archive, ENOMEM, + "Can't allocate data for compression buffer"); + free(state); + return (ARCHIVE_FATAL); +
} + + a->compressor.write = archive_compressor_compress_write; + a->compressor.finish = archive_compressor_compress_finish; + + state->max_maxcode = 0x10000; /* Should NEVER generate this code. */ + state->in_count = 0; /* Length of input. */ + state->bit_buf = 0; + state->bit_offset = 0; + state->out_count = 3; /* Includes 3-byte header mojo. */ + state->compress_ratio = 0; + state->checkpoint = CHECK_GAP; + state->code_len = 9; + state->cur_maxcode = MAXCODE(state->code_len); + state->first_free = FIRST; + + memset(state->hashtab, 0xff, sizeof(state->hashtab)); + + /* Prime output buffer with a compress (.Z) header. */ + state->compressed[0] = 0x1f; /* Compress */ + state->compressed[1] = 0x9d; + state->compressed[2] = 0x90; /* Block mode, 16bit max */ + state->compressed_offset = 3; + + a->compressor.data = state; + return (0); +} + +/*- + * Output the given code. + * Inputs: + * code: A n_bits-bit integer. If == -1, then EOF. This assumes + * that n_bits <= (long)wordsize - 1. + * Outputs: + * Outputs code to the file. + * Assumptions: + * Chars are 8 bits long. + * Algorithm: + * Maintain a BITS character long buffer (so that 8 codes will + * fit in it exactly). Use the VAX insv instruction to insert each + * code in turn. When the buffer fills up, empty it and start over.
+ */ + +static unsigned char rmask[9] = + {0x00, 0x01, 0x03, 0x07, 0x0f, 0x1f, 0x3f, 0x7f, 0xff}; + +static int +output_byte(struct archive_write *a, unsigned char c) +{ + struct private_data *state = a->compressor.data; + ssize_t bytes_written; + + state->compressed[state->compressed_offset++] = c; + ++state->out_count; + + if (state->compressed_buffer_size == state->compressed_offset) { + bytes_written = (a->client_writer)(&a->archive, + a->client_data, + state->compressed, state->compressed_buffer_size); + if (bytes_written <= 0) + return ARCHIVE_FATAL; + a->archive.raw_position += bytes_written; + state->compressed_offset = 0; + } + + return ARCHIVE_OK; +} + +static int +output_code(struct archive_write *a, int ocode) +{ + struct private_data *state = a->compressor.data; + int bits, ret, clear_flg, bit_offset; + + clear_flg = ocode == CLEAR; + + /* + * Since ocode is always >= 8 bits, only need to mask the first + * hunk on the left. + */ + bit_offset = state->bit_offset % 8; + state->bit_buf |= (ocode << bit_offset) & 0xff; + output_byte(a, state->bit_buf); + + bits = state->code_len - (8 - bit_offset); + ocode >>= 8 - bit_offset; + /* Get any 8 bit parts in the middle (<=1 for up to 16 bits). */ + if (bits >= 8) { + output_byte(a, ocode & 0xff); + ocode >>= 8; + bits -= 8; + } + /* Last bits. */ + state->bit_offset += state->code_len; + state->bit_buf = ocode & rmask[bits]; + if (state->bit_offset == state->code_len * 8) + state->bit_offset = 0; + + /* + * If the next entry is going to be too big for the ocode size, + * then increase it, if possible. + */ + if (clear_flg || state->first_free > state->cur_maxcode) { + /* + * Write the whole buffer, because the input side won't + * discover the size increase until after it has read it. 
+ */ + if (state->bit_offset > 0) { + while (state->bit_offset < state->code_len * 8) { + ret = output_byte(a, state->bit_buf); + if (ret != ARCHIVE_OK) + return ret; + state->bit_offset += 8; + state->bit_buf = 0; + } + } + state->bit_buf = 0; + state->bit_offset = 0; + + if (clear_flg) { + state->code_len = 9; + state->cur_maxcode = MAXCODE(state->code_len); + } else { + state->code_len++; + if (state->code_len == 16) + state->cur_maxcode = state->max_maxcode; + else + state->cur_maxcode = MAXCODE(state->code_len); + } + } + + return (ARCHIVE_OK); +} + +static int +output_flush(struct archive_write *a) +{ + struct private_data *state = a->compressor.data; + int ret; + + /* At EOF, write the rest of the buffer. */ + if (state->bit_offset % 8) { + state->code_len = (state->bit_offset % 8 + 7) / 8; + ret = output_byte(a, state->bit_buf); + if (ret != ARCHIVE_OK) + return ret; + } + + return (ARCHIVE_OK); +} + +/* + * Write data to the compressed stream. + */ +static int +archive_compressor_compress_write(struct archive_write *a, const void *buff, + size_t length) +{ + struct private_data *state; + int i; + int ratio; + int c, disp, ret; + const unsigned char *bp; + + state = (struct private_data *)a->compressor.data; + if (a->client_writer == NULL) { + archive_set_error(&a->archive, ARCHIVE_ERRNO_PROGRAMMER, + "No write callback is registered? " + "This is probably an internal programming error."); + return (ARCHIVE_FATAL); + } + + if (length == 0) + return ARCHIVE_OK; + + bp = buff; + + if (state->in_count == 0) { + state->cur_code = *bp++; + ++state->in_count; + --length; + } + + while (length--) { + c = *bp++; + state->in_count++; + state->cur_fcode = (c << 16) + state->cur_code; + i = ((c << HSHIFT) ^ state->cur_code); /* Xor hashing. */ + + if (state->hashtab[i] == state->cur_fcode) { + state->cur_code = state->codetab[i]; + continue; + } + if (state->hashtab[i] < 0) /* Empty slot. */ + goto nomatch; + /* Secondary hash (after G. Knott). 
*/ + if (i == 0) + disp = 1; + else + disp = HSIZE - i; + probe: + if ((i -= disp) < 0) + i += HSIZE; + + if (state->hashtab[i] == state->cur_fcode) { + state->cur_code = state->codetab[i]; + continue; + } + if (state->hashtab[i] >= 0) + goto probe; + nomatch: + ret = output_code(a, state->cur_code); + if (ret != ARCHIVE_OK) + return ret; + state->cur_code = c; + if (state->first_free < state->max_maxcode) { + state->codetab[i] = state->first_free++; /* code -> hashtable */ + state->hashtab[i] = state->cur_fcode; + continue; + } + if (state->in_count < state->checkpoint) + continue; + + state->checkpoint = state->in_count + CHECK_GAP; + + if (state->in_count <= 0x007fffff) + ratio = state->in_count * 256 / state->out_count; + else if ((ratio = state->out_count / 256) == 0) + ratio = 0x7fffffff; + else + ratio = state->in_count / ratio; + + if (ratio > state->compress_ratio) + state->compress_ratio = ratio; + else { + state->compress_ratio = 0; + memset(state->hashtab, 0xff, sizeof(state->hashtab)); + state->first_free = FIRST; + ret = output_code(a, CLEAR); + if (ret != ARCHIVE_OK) + return ret; + } + } + + return (ARCHIVE_OK); +} + + +/* + * Finish the compression... + */ +static int +archive_compressor_compress_finish(struct archive_write *a) +{ + ssize_t block_length, target_block_length, bytes_written; + int ret; + struct private_data *state; + size_t tocopy; + + state = (struct private_data *)a->compressor.data; + if (a->client_writer == NULL) { + archive_set_error(&a->archive, ARCHIVE_ERRNO_PROGRAMMER, + "No write callback is registered? " + "This is probably an internal programming error."); + ret = ARCHIVE_FATAL; + goto cleanup; + } + + /* By default, always pad the uncompressed data. 
*/ + if (a->pad_uncompressed) { + while (state->in_count % a->bytes_per_block != 0) { + tocopy = a->bytes_per_block - + (state->in_count % a->bytes_per_block); + if (tocopy > a->null_length) + tocopy = a->null_length; + ret = archive_compressor_compress_write(a, a->nulls, + tocopy); + if (ret != ARCHIVE_OK) + goto cleanup; + } + } + + ret = output_code(a, state->cur_code); + if (ret != ARCHIVE_OK) + goto cleanup; + ret = output_flush(a); + if (ret != ARCHIVE_OK) + goto cleanup; + + /* Optionally, pad the final compressed block. */ + block_length = state->compressed_offset; + + /* Tricky calculation to determine size of last block. */ + if (a->bytes_in_last_block <= 0) + /* Default or Zero: pad to full block */ + target_block_length = a->bytes_per_block; + else + /* Round length to next multiple of bytes_in_last_block. */ + target_block_length = a->bytes_in_last_block * + ( (block_length + a->bytes_in_last_block - 1) / + a->bytes_in_last_block); + if (target_block_length > a->bytes_per_block) + target_block_length = a->bytes_per_block; + if (block_length < target_block_length) { + memset(state->compressed + state->compressed_offset, 0, + target_block_length - block_length); + block_length = target_block_length; + } + + /* Write the last block */ + bytes_written = (a->client_writer)(&a->archive, a->client_data, + state->compressed, block_length); + if (bytes_written <= 0) + ret = ARCHIVE_FATAL; + else + a->archive.raw_position += bytes_written; + +cleanup: + free(state->compressed); + free(state); + return (ret); +} diff --git a/lib/libarchive/archive_write_set_compression_gzip.c b/lib/libarchive/archive_write_set_compression_gzip.c new file mode 100644 index 000000000..edc1a8e99 --- /dev/null +++ b/lib/libarchive/archive_write_set_compression_gzip.c @@ -0,0 +1,481 @@ +/*- + * Copyright (c) 2003-2007 Tim Kientzle + * All rights reserved. 
+ * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR(S) ``AS IS'' AND ANY EXPRESS OR + * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES + * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. + * IN NO EVENT SHALL THE AUTHOR(S) BE LIABLE FOR ANY DIRECT, INDIRECT, + * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT + * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF + * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + */ + +#include "archive_platform.h" + +__FBSDID("$FreeBSD: head/lib/libarchive/archive_write_set_compression_gzip.c 201081 2009-12-28 02:04:42Z kientzle $"); + +#ifdef HAVE_ERRNO_H +#include <errno.h> +#endif +#ifdef HAVE_STDLIB_H +#include <stdlib.h> +#endif +#ifdef HAVE_STRING_H +#include <string.h> +#endif +#include <time.h> +#ifdef HAVE_ZLIB_H +#include <zlib.h> +#endif + +#include "archive.h" +#include "archive_private.h" +#include "archive_write_private.h" + +#ifndef HAVE_ZLIB_H +int +archive_write_set_compression_gzip(struct archive *a) +{ + archive_set_error(a, ARCHIVE_ERRNO_MISC, + "gzip compression not supported on this platform"); + return (ARCHIVE_FATAL); +} +#else +/* Don't compile this if we don't have zlib.
*/ + +struct private_data { + z_stream stream; +#ifndef __minix + int64_t total_in; +#else + ssize_t total_in; +#endif + unsigned char *compressed; + size_t compressed_buffer_size; + unsigned long crc; +}; + +struct private_config { + int compression_level; +}; + + +/* + * Yuck. zlib.h is not const-correct, so I need this one bit + * of ugly hackery to convert a const * pointer to a non-const pointer. + */ +#define SET_NEXT_IN(st,src) \ + (st)->stream.next_in = (Bytef *)(uintptr_t)(const void *)(src) + +static int archive_compressor_gzip_finish(struct archive_write *); +static int archive_compressor_gzip_init(struct archive_write *); +static int archive_compressor_gzip_options(struct archive_write *, + const char *, const char *); +static int archive_compressor_gzip_write(struct archive_write *, + const void *, size_t); +static int drive_compressor(struct archive_write *, struct private_data *, + int finishing); + + +/* + * Allocate, initialize and return an archive object. + */ +int +archive_write_set_compression_gzip(struct archive *_a) +{ + struct archive_write *a = (struct archive_write *)_a; + struct private_config *config; + __archive_check_magic(&a->archive, ARCHIVE_WRITE_MAGIC, + ARCHIVE_STATE_NEW, "archive_write_set_compression_gzip"); + config = malloc(sizeof(*config)); + if (config == NULL) { + archive_set_error(&a->archive, ENOMEM, "Out of memory"); + return (ARCHIVE_FATAL); + } + a->compressor.config = config; + a->compressor.finish = &archive_compressor_gzip_finish; + config->compression_level = Z_DEFAULT_COMPRESSION; + a->compressor.init = &archive_compressor_gzip_init; + a->compressor.options = &archive_compressor_gzip_options; + a->archive.compression_code = ARCHIVE_COMPRESSION_GZIP; + a->archive.compression_name = "gzip"; + return (ARCHIVE_OK); +} + +/* + * Setup callback.
+ */ +static int +archive_compressor_gzip_init(struct archive_write *a) +{ + int ret; + struct private_data *state; + struct private_config *config; + time_t t; + + config = (struct private_config *)a->compressor.config; + + if (a->client_opener != NULL) { + ret = (a->client_opener)(&a->archive, a->client_data); + if (ret != ARCHIVE_OK) + return (ret); + } + + /* + * The next check is a temporary workaround until the gzip + * code can be overhauled some. The code should not require + * that compressed_buffer_size == bytes_per_block. Removing + * this assumption will allow us to compress larger chunks at + * a time, which should improve overall performance + * marginally. As a minor side-effect, such a cleanup would + * allow us to support truly arbitrary block sizes. + */ + if (a->bytes_per_block < 10) { + archive_set_error(&a->archive, EINVAL, + "GZip compressor requires a minimum 10 byte block size"); + return (ARCHIVE_FATAL); + } + + state = (struct private_data *)malloc(sizeof(*state)); + if (state == NULL) { + archive_set_error(&a->archive, ENOMEM, + "Can't allocate data for compression"); + return (ARCHIVE_FATAL); + } + memset(state, 0, sizeof(*state)); + + /* + * See comment above. We should set compressed_buffer_size to + * max(bytes_per_block, 65536), but the code can't handle that yet. + */ + state->compressed_buffer_size = a->bytes_per_block; + state->compressed = (unsigned char *)malloc(state->compressed_buffer_size); + state->crc = crc32(0L, NULL, 0); + + if (state->compressed == NULL) { + archive_set_error(&a->archive, ENOMEM, + "Can't allocate data for compression buffer"); + free(state); + return (ARCHIVE_FATAL); + } + + state->stream.next_out = state->compressed; + state->stream.avail_out = state->compressed_buffer_size; + + /* Prime output buffer with a gzip header. 
*/ + t = time(NULL); + state->compressed[0] = 0x1f; /* GZip signature bytes */ + state->compressed[1] = 0x8b; + state->compressed[2] = 0x08; /* "Deflate" compression */ + state->compressed[3] = 0; /* No options */ + state->compressed[4] = (t)&0xff; /* Timestamp */ + state->compressed[5] = (t>>8)&0xff; + state->compressed[6] = (t>>16)&0xff; + state->compressed[7] = (t>>24)&0xff; + state->compressed[8] = 0; /* No deflate options */ + state->compressed[9] = 3; /* OS=Unix */ + state->stream.next_out += 10; + state->stream.avail_out -= 10; + + a->compressor.write = archive_compressor_gzip_write; + + /* Initialize compression library. */ + ret = deflateInit2(&(state->stream), + config->compression_level, + Z_DEFLATED, + -15 /* < 0 to suppress zlib header */, + 8, + Z_DEFAULT_STRATEGY); + + if (ret == Z_OK) { + a->compressor.data = state; + return (0); + } + + /* Library setup failed: clean up. */ + archive_set_error(&a->archive, ARCHIVE_ERRNO_MISC, "Internal error " + "initializing compression library"); + free(state->compressed); + free(state); + + /* Override the error message if we know what really went wrong. */ + switch (ret) { + case Z_STREAM_ERROR: + archive_set_error(&a->archive, ARCHIVE_ERRNO_MISC, + "Internal error initializing " + "compression library: invalid setup parameter"); + break; + case Z_MEM_ERROR: + archive_set_error(&a->archive, ENOMEM, "Internal error initializing " + "compression library"); + break; + case Z_VERSION_ERROR: + archive_set_error(&a->archive, ARCHIVE_ERRNO_MISC, + "Internal error initializing " + "compression library: invalid library version"); + break; + } + + return (ARCHIVE_FATAL); +} + +/* + * Set write options. 
+ */ +static int +archive_compressor_gzip_options(struct archive_write *a, const char *key, + const char *value) +{ + struct private_config *config; + + config = (struct private_config *)a->compressor.config; + if (strcmp(key, "compression-level") == 0) { + if (value == NULL || !(value[0] >= '0' && value[0] <= '9') || + value[1] != '\0') + return (ARCHIVE_WARN); + config->compression_level = value[0] - '0'; + return (ARCHIVE_OK); + } + + return (ARCHIVE_WARN); +} + +/* + * Write data to the compressed stream. + */ +static int +archive_compressor_gzip_write(struct archive_write *a, const void *buff, + size_t length) +{ + struct private_data *state; + int ret; + + state = (struct private_data *)a->compressor.data; + if (a->client_writer == NULL) { + archive_set_error(&a->archive, ARCHIVE_ERRNO_PROGRAMMER, + "No write callback is registered? " + "This is probably an internal programming error."); + return (ARCHIVE_FATAL); + } + + /* Update statistics */ + state->crc = crc32(state->crc, (const Bytef *)buff, length); + state->total_in += length; + + /* Compress input data to output buffer */ + SET_NEXT_IN(state, buff); + state->stream.avail_in = length; + if ((ret = drive_compressor(a, state, 0)) != ARCHIVE_OK) + return (ret); + + a->archive.file_position += length; + return (ARCHIVE_OK); +} + +/* + * Finish the compression... + */ +static int +archive_compressor_gzip_finish(struct archive_write *a) +{ + ssize_t block_length, target_block_length, bytes_written; + int ret; + struct private_data *state; + unsigned tocopy; + unsigned char trailer[8]; + + state = (struct private_data *)a->compressor.data; + ret = 0; + if (state != NULL) { + if (a->client_writer == NULL) { + archive_set_error(&a->archive, + ARCHIVE_ERRNO_PROGRAMMER, + "No write callback is registered? " + "This is probably an internal programming error."); + ret = ARCHIVE_FATAL; + goto cleanup; + } + + /* By default, always pad the uncompressed data. 
*/ + if (a->pad_uncompressed) { + tocopy = a->bytes_per_block - + (state->total_in % a->bytes_per_block); + while (tocopy > 0 && tocopy < (unsigned)a->bytes_per_block) { + SET_NEXT_IN(state, a->nulls); + state->stream.avail_in = tocopy < a->null_length ? + tocopy : a->null_length; + state->crc = crc32(state->crc, a->nulls, + state->stream.avail_in); + state->total_in += state->stream.avail_in; + tocopy -= state->stream.avail_in; + ret = drive_compressor(a, state, 0); + if (ret != ARCHIVE_OK) + goto cleanup; + } + } + + /* Finish compression cycle */ + if (((ret = drive_compressor(a, state, 1))) != ARCHIVE_OK) + goto cleanup; + + /* Build trailer: 4-byte CRC and 4-byte length. */ + trailer[0] = (state->crc)&0xff; + trailer[1] = (state->crc >> 8)&0xff; + trailer[2] = (state->crc >> 16)&0xff; + trailer[3] = (state->crc >> 24)&0xff; + trailer[4] = (state->total_in)&0xff; + trailer[5] = (state->total_in >> 8)&0xff; + trailer[6] = (state->total_in >> 16)&0xff; + trailer[7] = (state->total_in >> 24)&0xff; + + /* Add trailer to current block. */ + tocopy = 8; + if (tocopy > state->stream.avail_out) + tocopy = state->stream.avail_out; + memcpy(state->stream.next_out, trailer, tocopy); + state->stream.next_out += tocopy; + state->stream.avail_out -= tocopy; + + /* If it overflowed, flush and start a new block. */ + if (tocopy < 8) { + bytes_written = (a->client_writer)(&a->archive, a->client_data, + state->compressed, state->compressed_buffer_size); + if (bytes_written <= 0) { + ret = ARCHIVE_FATAL; + goto cleanup; + } + a->archive.raw_position += bytes_written; + state->stream.next_out = state->compressed; + state->stream.avail_out = state->compressed_buffer_size; + memcpy(state->stream.next_out, trailer + tocopy, 8-tocopy); + state->stream.next_out += 8-tocopy; + state->stream.avail_out -= 8-tocopy; + } + + /* Optionally, pad the final compressed block. 
*/ + block_length = state->stream.next_out - state->compressed; + + /* Tricky calculation to determine size of last block. */ + if (a->bytes_in_last_block <= 0) + /* Default or Zero: pad to full block */ + target_block_length = a->bytes_per_block; + else + /* Round length to next multiple of bytes_in_last_block. */ + target_block_length = a->bytes_in_last_block * + ( (block_length + a->bytes_in_last_block - 1) / + a->bytes_in_last_block); + if (target_block_length > a->bytes_per_block) + target_block_length = a->bytes_per_block; + if (block_length < target_block_length) { + memset(state->stream.next_out, 0, + target_block_length - block_length); + block_length = target_block_length; + } + + /* Write the last block */ + bytes_written = (a->client_writer)(&a->archive, a->client_data, + state->compressed, block_length); + if (bytes_written <= 0) { + ret = ARCHIVE_FATAL; + goto cleanup; + } + a->archive.raw_position += bytes_written; + + /* Cleanup: shut down compressor, release memory, etc. */ + cleanup: + switch (deflateEnd(&(state->stream))) { + case Z_OK: + break; + default: + archive_set_error(&a->archive, ARCHIVE_ERRNO_MISC, + "Failed to clean up compressor"); + ret = ARCHIVE_FATAL; + } + free(state->compressed); + free(state); + } + /* Clean up config area even if we never initialized. */ + free(a->compressor.config); + a->compressor.config = NULL; + return (ret); +} + +/* + * Utility function to push input data through compressor, + * writing full output blocks as necessary. + * + * Note that this handles both the regular write case (finishing == + * false) and the end-of-archive case (finishing == true). 
+ */ +static int +drive_compressor(struct archive_write *a, struct private_data *state, int finishing) +{ + ssize_t bytes_written; + int ret; + + for (;;) { + if (state->stream.avail_out == 0) { + bytes_written = (a->client_writer)(&a->archive, + a->client_data, state->compressed, + state->compressed_buffer_size); + if (bytes_written <= 0) { + /* TODO: Handle this write failure */ + return (ARCHIVE_FATAL); + } else if ((size_t)bytes_written < state->compressed_buffer_size) { + /* Short write: Move remaining to + * front of block and keep filling */ + memmove(state->compressed, + state->compressed + bytes_written, + state->compressed_buffer_size - bytes_written); + } + a->archive.raw_position += bytes_written; + state->stream.next_out + = state->compressed + + state->compressed_buffer_size - bytes_written; + state->stream.avail_out = bytes_written; + } + + /* If there's nothing to do, we're done. */ + if (!finishing && state->stream.avail_in == 0) + return (ARCHIVE_OK); + + ret = deflate(&(state->stream), + finishing ? Z_FINISH : Z_NO_FLUSH ); + + switch (ret) { + case Z_OK: + /* In non-finishing case, check if compressor + * consumed everything */ + if (!finishing && state->stream.avail_in == 0) + return (ARCHIVE_OK); + /* In finishing case, this return always means + * there's more work */ + break; + case Z_STREAM_END: + /* This return can only occur in finishing case. */ + return (ARCHIVE_OK); + default: + /* Any other return value indicates an error. 
*/ + archive_set_error(&a->archive, ARCHIVE_ERRNO_MISC, + "GZip compression failed:" + " deflate() call returned status %d", + ret); + return (ARCHIVE_FATAL); + } + } +} + +#endif /* HAVE_ZLIB_H */ diff --git a/lib/libarchive/archive_write_set_compression_none.c b/lib/libarchive/archive_write_set_compression_none.c new file mode 100644 index 000000000..e0216d9e1 --- /dev/null +++ b/lib/libarchive/archive_write_set_compression_none.c @@ -0,0 +1,257 @@ +/*- + * Copyright (c) 2003-2007 Tim Kientzle + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR(S) ``AS IS'' AND ANY EXPRESS OR + * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES + * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. + * IN NO EVENT SHALL THE AUTHOR(S) BE LIABLE FOR ANY DIRECT, INDIRECT, + * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT + * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF + * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 
+ */ + +#include "archive_platform.h" +__FBSDID("$FreeBSD: head/lib/libarchive/archive_write_set_compression_none.c 201080 2009-12-28 02:03:54Z kientzle $"); + +#ifdef HAVE_ERRNO_H +#include <errno.h> +#endif +#ifdef HAVE_STDLIB_H +#include <stdlib.h> +#endif +#ifdef HAVE_STRING_H +#include <string.h> +#endif + +#include "archive.h" +#include "archive_private.h" +#include "archive_write_private.h" + +static int archive_compressor_none_finish(struct archive_write *a); +static int archive_compressor_none_init(struct archive_write *); +static int archive_compressor_none_write(struct archive_write *, + const void *, size_t); + +struct archive_none { + char *buffer; + ssize_t buffer_size; + char *next; /* Current insert location */ + ssize_t avail; /* Free space left in buffer */ +}; + +int +archive_write_set_compression_none(struct archive *_a) +{ + struct archive_write *a = (struct archive_write *)_a; + __archive_check_magic(&a->archive, ARCHIVE_WRITE_MAGIC, + ARCHIVE_STATE_NEW, "archive_write_set_compression_none"); + a->compressor.init = &archive_compressor_none_init; + return (0); +} + +/* + * Setup callback.
+ */ +static int +archive_compressor_none_init(struct archive_write *a) +{ + int ret; + struct archive_none *state; + + a->archive.compression_code = ARCHIVE_COMPRESSION_NONE; + a->archive.compression_name = "none"; + + if (a->client_opener != NULL) { + ret = (a->client_opener)(&a->archive, a->client_data); + if (ret != 0) + return (ret); + } + + state = (struct archive_none *)malloc(sizeof(*state)); + if (state == NULL) { + archive_set_error(&a->archive, ENOMEM, + "Can't allocate data for output buffering"); + return (ARCHIVE_FATAL); + } + memset(state, 0, sizeof(*state)); + + state->buffer_size = a->bytes_per_block; + if (state->buffer_size != 0) { + state->buffer = (char *)malloc(state->buffer_size); + if (state->buffer == NULL) { + archive_set_error(&a->archive, ENOMEM, + "Can't allocate output buffer"); + free(state); + return (ARCHIVE_FATAL); + } + } + + state->next = state->buffer; + state->avail = state->buffer_size; + + a->compressor.data = state; + a->compressor.write = archive_compressor_none_write; + a->compressor.finish = archive_compressor_none_finish; + return (ARCHIVE_OK); +} + +/* + * Write data to the stream. + */ +static int +archive_compressor_none_write(struct archive_write *a, const void *vbuff, + size_t length) +{ + const char *buff; + ssize_t remaining, to_copy; + ssize_t bytes_written; + struct archive_none *state; + + state = (struct archive_none *)a->compressor.data; + buff = (const char *)vbuff; + if (a->client_writer == NULL) { + archive_set_error(&a->archive, ARCHIVE_ERRNO_PROGRAMMER, + "No write callback is registered? " + "This is probably an internal programming error."); + return (ARCHIVE_FATAL); + } + + remaining = length; + + /* + * If there is no buffer for blocking, just pass the data + * straight through to the client write callback. In + * particular, this supports "no write delay" operation for + * special applications. Just set the block size to zero. 
+ */ + if (state->buffer_size == 0) { + while (remaining > 0) { + bytes_written = (a->client_writer)(&a->archive, + a->client_data, buff, remaining); + if (bytes_written <= 0) + return (ARCHIVE_FATAL); + a->archive.raw_position += bytes_written; + remaining -= bytes_written; + buff += bytes_written; + } + a->archive.file_position += length; + return (ARCHIVE_OK); + } + + /* If the copy buffer isn't empty, try to fill it. */ + if (state->avail < state->buffer_size) { + /* If buffer is not empty... */ + /* ... copy data into buffer ... */ + to_copy = (remaining > state->avail) ? + state->avail : remaining; + memcpy(state->next, buff, to_copy); + state->next += to_copy; + state->avail -= to_copy; + buff += to_copy; + remaining -= to_copy; + /* ... if it's full, write it out. */ + if (state->avail == 0) { + bytes_written = (a->client_writer)(&a->archive, + a->client_data, state->buffer, state->buffer_size); + if (bytes_written <= 0) + return (ARCHIVE_FATAL); + /* XXX TODO: if bytes_written < state->buffer_size */ + a->archive.raw_position += bytes_written; + state->next = state->buffer; + state->avail = state->buffer_size; + } + } + + while (remaining > state->buffer_size) { + /* Write out full blocks directly to client. */ + bytes_written = (a->client_writer)(&a->archive, + a->client_data, buff, state->buffer_size); + if (bytes_written <= 0) + return (ARCHIVE_FATAL); + a->archive.raw_position += bytes_written; + buff += bytes_written; + remaining -= bytes_written; + } + + if (remaining > 0) { + /* Copy last bit into copy buffer. */ + memcpy(state->next, buff, remaining); + state->next += remaining; + state->avail -= remaining; + } + + a->archive.file_position += length; + return (ARCHIVE_OK); +} + + +/* + * Finish the compression. 
+ */ +static int +archive_compressor_none_finish(struct archive_write *a) +{ + ssize_t block_length; + ssize_t target_block_length; + ssize_t bytes_written; + int ret; + struct archive_none *state; + + state = (struct archive_none *)a->compressor.data; + ret = ARCHIVE_OK; + if (a->client_writer == NULL) { + archive_set_error(&a->archive, ARCHIVE_ERRNO_PROGRAMMER, + "No write callback is registered? " + "This is probably an internal programming error."); + return (ARCHIVE_FATAL); + } + + /* If there's pending data, pad and write the last block */ + if (state->next != state->buffer) { + block_length = state->buffer_size - state->avail; + + /* Tricky calculation to determine size of last block */ + if (a->bytes_in_last_block <= 0) + /* Default or Zero: pad to full block */ + target_block_length = a->bytes_per_block; + else + /* Round to next multiple of bytes_in_last_block. */ + target_block_length = a->bytes_in_last_block * + ( (block_length + a->bytes_in_last_block - 1) / + a->bytes_in_last_block); + if (target_block_length > a->bytes_per_block) + target_block_length = a->bytes_per_block; + if (block_length < target_block_length) { + memset(state->next, 0, + target_block_length - block_length); + block_length = target_block_length; + } + bytes_written = (a->client_writer)(&a->archive, + a->client_data, state->buffer, block_length); + if (bytes_written <= 0) + ret = ARCHIVE_FATAL; + else { + a->archive.raw_position += bytes_written; + ret = ARCHIVE_OK; + } + } + if (state->buffer) + free(state->buffer); + free(state); + a->compressor.data = NULL; + + return (ret); +} diff --git a/lib/libarchive/archive_write_set_compression_program.c b/lib/libarchive/archive_write_set_compression_program.c new file mode 100644 index 000000000..475ba3540 --- /dev/null +++ b/lib/libarchive/archive_write_set_compression_program.c @@ -0,0 +1,347 @@ +/*- + * Copyright (c) 2007 Joerg Sonnenberger + * All rights reserved. 
+ * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR(S) ``AS IS'' AND ANY EXPRESS OR + * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES + * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. + * IN NO EVENT SHALL THE AUTHOR(S) BE LIABLE FOR ANY DIRECT, INDIRECT, + * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT + * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF + * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + */ + +#include "archive_platform.h" + +__FBSDID("$FreeBSD: head/lib/libarchive/archive_write_set_compression_program.c 201104 2009-12-28 03:14:30Z kientzle $"); + +/* This capability is only available on POSIX systems. */ +#if (!defined(HAVE_PIPE) || !defined(HAVE_FCNTL) || \ + !(defined(HAVE_FORK) || defined(HAVE_VFORK))) && (!defined(_WIN32) || defined(__CYGWIN__)) +#include "archive.h" + +/* + * On non-Posix systems, allow the program to build, but choke if + * this function is actually invoked. 
+ */ +int +archive_write_set_compression_program(struct archive *_a, const char *cmd) +{ + archive_set_error(_a, -1, + "External compression programs not supported on this platform"); + return (ARCHIVE_FATAL); +} + +#else + +#ifdef HAVE_SYS_WAIT_H +# include <sys/wait.h> +#endif +#ifdef HAVE_ERRNO_H +# include <errno.h> +#endif +#ifdef HAVE_FCNTL_H +# include <fcntl.h> +#endif +#ifdef HAVE_STDLIB_H +# include <stdlib.h> +#endif +#ifdef HAVE_STRING_H +# include <string.h> +#endif + +#include "archive.h" +#include "archive_private.h" +#include "archive_write_private.h" + +#include "filter_fork.h" + +struct private_data { + char *description; + pid_t child; + int child_stdin, child_stdout; + + char *child_buf; + size_t child_buf_len, child_buf_avail; +}; + +static int archive_compressor_program_finish(struct archive_write *); +static int archive_compressor_program_init(struct archive_write *); +static int archive_compressor_program_write(struct archive_write *, + const void *, size_t); + +/* + * Allocate, initialize and return an archive object. + */ +int +archive_write_set_compression_program(struct archive *_a, const char *cmd) +{ + struct archive_write *a = (struct archive_write *)_a; + __archive_check_magic(&a->archive, ARCHIVE_WRITE_MAGIC, + ARCHIVE_STATE_NEW, "archive_write_set_compression_program"); + a->compressor.init = &archive_compressor_program_init; + a->compressor.config = strdup(cmd); + return (ARCHIVE_OK); +} + +/* + * Setup callback.
+ */ +static int +archive_compressor_program_init(struct archive_write *a) +{ + int ret; + struct private_data *state; + static const char *prefix = "Program: "; + char *cmd = a->compressor.config; + + if (a->client_opener != NULL) { + ret = (a->client_opener)(&a->archive, a->client_data); + if (ret != ARCHIVE_OK) + return (ret); + } + + state = (struct private_data *)malloc(sizeof(*state)); + if (state == NULL) { + archive_set_error(&a->archive, ENOMEM, + "Can't allocate data for compression"); + return (ARCHIVE_FATAL); + } + memset(state, 0, sizeof(*state)); + + a->archive.compression_code = ARCHIVE_COMPRESSION_PROGRAM; + state->description = (char *)malloc(strlen(prefix) + strlen(cmd) + 1); + strcpy(state->description, prefix); + strcat(state->description, cmd); + a->archive.compression_name = state->description; + + state->child_buf_len = a->bytes_per_block; + state->child_buf_avail = 0; + state->child_buf = malloc(state->child_buf_len); + + if (state->child_buf == NULL) { + archive_set_error(&a->archive, ENOMEM, + "Can't allocate data for compression buffer"); + free(state); + return (ARCHIVE_FATAL); + } + + if ((state->child = __archive_create_child(cmd, + &state->child_stdin, &state->child_stdout)) == -1) { + archive_set_error(&a->archive, EINVAL, + "Can't initialise filter"); + free(state->child_buf); + free(state); + return (ARCHIVE_FATAL); + } + + a->compressor.write = archive_compressor_program_write; + a->compressor.finish = archive_compressor_program_finish; + + a->compressor.data = state; + return (0); +} + +static ssize_t +child_write(struct archive_write *a, const char *buf, size_t buf_len) +{ + struct private_data *state = a->compressor.data; + ssize_t ret; + + if (state->child_stdin == -1) + return (-1); + + if (buf_len == 0) + return (-1); + +restart_write: + do { + ret = write(state->child_stdin, buf, buf_len); + } while (ret == -1 && errno == EINTR); + + if (ret > 0) + return (ret); + if (ret == 0) { + close(state->child_stdin); + 
state->child_stdin = -1; + fcntl(state->child_stdout, F_SETFL, 0); + return (0); + } + if (ret == -1 && errno != EAGAIN) + return (-1); + + if (state->child_stdout == -1) { + fcntl(state->child_stdin, F_SETFL, 0); + __archive_check_child(state->child_stdin, state->child_stdout); + goto restart_write; + } + + do { + ret = read(state->child_stdout, + state->child_buf + state->child_buf_avail, + state->child_buf_len - state->child_buf_avail); + } while (ret == -1 && errno == EINTR); + + if (ret == 0 || (ret == -1 && errno == EPIPE)) { + close(state->child_stdout); + state->child_stdout = -1; + fcntl(state->child_stdin, F_SETFL, 0); + goto restart_write; + } + if (ret == -1 && errno == EAGAIN) { + __archive_check_child(state->child_stdin, state->child_stdout); + goto restart_write; + } + if (ret == -1) + return (-1); + + state->child_buf_avail += ret; + + ret = (a->client_writer)(&a->archive, a->client_data, + state->child_buf, state->child_buf_avail); + if (ret <= 0) + return (-1); + + if ((size_t)ret < state->child_buf_avail) { + memmove(state->child_buf, state->child_buf + ret, + state->child_buf_avail - ret); + } + state->child_buf_avail -= ret; + a->archive.raw_position += ret; + goto restart_write; +} + +/* + * Write data to the compressed stream. + */ +static int +archive_compressor_program_write(struct archive_write *a, const void *buff, + size_t length) +{ + ssize_t ret; + const char *buf; + + if (a->client_writer == NULL) { + archive_set_error(&a->archive, ARCHIVE_ERRNO_PROGRAMMER, + "No write callback is registered? " + "This is probably an internal programming error."); + return (ARCHIVE_FATAL); + } + + buf = buff; + while (length > 0) { + ret = child_write(a, buf, length); + if (ret == -1 || ret == 0) { + archive_set_error(&a->archive, EIO, + "Can't write to filter"); + return (ARCHIVE_FATAL); + } + length -= ret; + buf += ret; + } + + a->archive.file_position += buf - (const char *)buff; /* length is 0 here; count the bytes consumed */ + return (ARCHIVE_OK); +} + + +/* + * Finish the compression...
+ */ +static int +archive_compressor_program_finish(struct archive_write *a) +{ + int ret, status; + ssize_t bytes_read, bytes_written; + struct private_data *state; + + state = (struct private_data *)a->compressor.data; + ret = 0; + if (a->client_writer == NULL) { + archive_set_error(&a->archive, ARCHIVE_ERRNO_PROGRAMMER, + "No write callback is registered? " + "This is probably an internal programming error."); + ret = ARCHIVE_FATAL; + goto cleanup; + } + + /* XXX pad compressed data. */ + + close(state->child_stdin); + state->child_stdin = -1; + fcntl(state->child_stdout, F_SETFL, 0); + + for (;;) { + do { + bytes_read = read(state->child_stdout, + state->child_buf + state->child_buf_avail, + state->child_buf_len - state->child_buf_avail); + } while (bytes_read == -1 && errno == EINTR); + + if (bytes_read == 0 || (bytes_read == -1 && errno == EPIPE)) + break; + + if (bytes_read == -1) { + archive_set_error(&a->archive, errno, + "Read from filter failed unexpectedly."); + ret = ARCHIVE_FATAL; + goto cleanup; + } + state->child_buf_avail += bytes_read; + + bytes_written = (a->client_writer)(&a->archive, a->client_data, + state->child_buf, state->child_buf_avail); + if (bytes_written <= 0) { + ret = ARCHIVE_FATAL; + goto cleanup; + } + if ((size_t)bytes_written < state->child_buf_avail) { + memmove(state->child_buf, + state->child_buf + bytes_written, + state->child_buf_avail - bytes_written); + } + state->child_buf_avail -= bytes_written; + a->archive.raw_position += bytes_written; + } + + /* XXX pad final compressed block. */ + +cleanup: + /* Shut down the child. */ + if (state->child_stdin != -1) + close(state->child_stdin); + if (state->child_stdout != -1) + close(state->child_stdout); + while (waitpid(state->child, &status, 0) == -1 && errno == EINTR) + continue; + + if (status != 0) { + archive_set_error(&a->archive, EIO, + "Filter exited with failure."); + ret = ARCHIVE_FATAL; + } + + /* Release our configuration data. 
*/ + free(a->compressor.config); + a->compressor.config = NULL; + + /* Release our private state data. */ + free(state->child_buf); + free(state->description); + free(state); + return (ret); +} + +#endif /* !defined(HAVE_PIPE) || !defined(HAVE_VFORK) || !defined(HAVE_FCNTL) */ diff --git a/lib/libarchive/archive_write_set_compression_xz.c b/lib/libarchive/archive_write_set_compression_xz.c new file mode 100644 index 000000000..f82f6db62 --- /dev/null +++ b/lib/libarchive/archive_write_set_compression_xz.c @@ -0,0 +1,438 @@ +/*- + * Copyright (c) 2009 Michihiro NAKAJIMA + * Copyright (c) 2003-2007 Tim Kientzle + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR(S) ``AS IS'' AND ANY EXPRESS OR + * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES + * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. + * IN NO EVENT SHALL THE AUTHOR(S) BE LIABLE FOR ANY DIRECT, INDIRECT, + * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT + * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF + * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 
+ */ + +#include "archive_platform.h" + +__FBSDID("$FreeBSD: head/lib/libarchive/archive_write_set_compression_xz.c 201108 2009-12-28 03:28:21Z kientzle $"); + +#ifdef HAVE_ERRNO_H +#include <errno.h> +#endif +#ifdef HAVE_STDLIB_H +#include <stdlib.h> +#endif +#ifdef HAVE_STRING_H +#include <string.h> +#endif +#include +#ifdef HAVE_LZMA_H +#include <lzma.h> +#endif + +#include "archive.h" +#include "archive_private.h" +#include "archive_write_private.h" + +#ifndef HAVE_LZMA_H +int +archive_write_set_compression_xz(struct archive *a) +{ + archive_set_error(a, ARCHIVE_ERRNO_MISC, + "xz compression not supported on this platform"); + return (ARCHIVE_FATAL); +} + +int +archive_write_set_compression_lzma(struct archive *a) +{ + archive_set_error(a, ARCHIVE_ERRNO_MISC, + "lzma compression not supported on this platform"); + return (ARCHIVE_FATAL); +} +#else +/* Don't compile this if we don't have liblzma. */ + +struct private_data { + lzma_stream stream; + lzma_filter lzmafilters[2]; + lzma_options_lzma lzma_opt; + int64_t total_in; + unsigned char *compressed; + size_t compressed_buffer_size; +}; + +struct private_config { + int compression_level; +}; + +static int archive_compressor_xz_init(struct archive_write *); +static int archive_compressor_xz_options(struct archive_write *, + const char *, const char *); +static int archive_compressor_xz_finish(struct archive_write *); +static int archive_compressor_xz_write(struct archive_write *, + const void *, size_t); +static int drive_compressor(struct archive_write *, struct private_data *, + int finishing); + + +/* + * Allocate, initialize and return a archive object. 
+ */ +int +archive_write_set_compression_xz(struct archive *_a) +{ + struct private_config *config; + struct archive_write *a = (struct archive_write *)_a; + __archive_check_magic(&a->archive, ARCHIVE_WRITE_MAGIC, + ARCHIVE_STATE_NEW, "archive_write_set_compression_xz"); + config = calloc(1, sizeof(*config)); + if (config == NULL) { + archive_set_error(&a->archive, ENOMEM, "Out of memory"); + return (ARCHIVE_FATAL); + } + a->compressor.config = config; + a->compressor.finish = archive_compressor_xz_finish; + config->compression_level = LZMA_PRESET_DEFAULT; + a->compressor.init = &archive_compressor_xz_init; + a->compressor.options = &archive_compressor_xz_options; + a->archive.compression_code = ARCHIVE_COMPRESSION_XZ; + a->archive.compression_name = "xz"; + return (ARCHIVE_OK); +} + +/* LZMA is handled identically, we just need a different compression + * code set. (The liblzma setup looks at the code to determine + * the one place that XZ and LZMA require different handling.) */ +int +archive_write_set_compression_lzma(struct archive *_a) +{ + struct archive_write *a = (struct archive_write *)_a; + int r = archive_write_set_compression_xz(_a); + if (r != ARCHIVE_OK) + return (r); + a->archive.compression_code = ARCHIVE_COMPRESSION_LZMA; + a->archive.compression_name = "lzma"; + return (ARCHIVE_OK); +} + +static int +archive_compressor_xz_init_stream(struct archive_write *a, + struct private_data *state) +{ + int ret; + + state->stream = (lzma_stream)LZMA_STREAM_INIT; + state->stream.next_out = state->compressed; + state->stream.avail_out = state->compressed_buffer_size; + if (a->archive.compression_code == ARCHIVE_COMPRESSION_XZ) + ret = lzma_stream_encoder(&(state->stream), + state->lzmafilters, LZMA_CHECK_CRC64); + else + ret = lzma_alone_encoder(&(state->stream), &state->lzma_opt); + if (ret == LZMA_OK) + return (ARCHIVE_OK); + + switch (ret) { + case LZMA_MEM_ERROR: + archive_set_error(&a->archive, ENOMEM, + "Internal error initializing compression library: " 
+ "Cannot allocate memory"); + break; + default: + archive_set_error(&a->archive, ARCHIVE_ERRNO_MISC, + "Internal error initializing compression library: " + "It's a bug in liblzma"); + break; + } + return (ARCHIVE_FATAL); +} + +/* + * Setup callback. + */ +static int +archive_compressor_xz_init(struct archive_write *a) +{ + int ret; + struct private_data *state; + struct private_config *config; + + if (a->client_opener != NULL) { + ret = (a->client_opener)(&a->archive, a->client_data); + if (ret != ARCHIVE_OK) + return (ret); + } + + state = (struct private_data *)malloc(sizeof(*state)); + if (state == NULL) { + archive_set_error(&a->archive, ENOMEM, + "Can't allocate data for compression"); + return (ARCHIVE_FATAL); + } + memset(state, 0, sizeof(*state)); + config = a->compressor.config; + + /* + * See comment above. We should set compressed_buffer_size to + * max(bytes_per_block, 65536), but the code can't handle that yet. + */ + state->compressed_buffer_size = a->bytes_per_block; + state->compressed = (unsigned char *)malloc(state->compressed_buffer_size); + if (state->compressed == NULL) { + archive_set_error(&a->archive, ENOMEM, + "Can't allocate data for compression buffer"); + free(state); + return (ARCHIVE_FATAL); + } + a->compressor.write = archive_compressor_xz_write; + + /* Initialize compression library. */ + if (lzma_lzma_preset(&state->lzma_opt, config->compression_level)) { + archive_set_error(&a->archive, ARCHIVE_ERRNO_MISC, + "Internal error initializing compression library"); + free(state->compressed); + free(state); + } + state->lzmafilters[0].id = LZMA_FILTER_LZMA2; + state->lzmafilters[0].options = &state->lzma_opt; + state->lzmafilters[1].id = LZMA_VLI_UNKNOWN;/* Terminate */ + ret = archive_compressor_xz_init_stream(a, state); + if (ret == LZMA_OK) { + a->compressor.data = state; + return (0); + } + /* Library setup failed: clean up. */ + free(state->compressed); + free(state); + + return (ARCHIVE_FATAL); +} + +/* + * Set write options. 
+ */ +static int +archive_compressor_xz_options(struct archive_write *a, const char *key, + const char *value) +{ + struct private_config *config; + + config = (struct private_config *)a->compressor.config; + if (strcmp(key, "compression-level") == 0) { + if (value == NULL || !(value[0] >= '0' && value[0] <= '9') || + value[1] != '\0') + return (ARCHIVE_WARN); + config->compression_level = value[0] - '0'; + if (config->compression_level > 6) + config->compression_level = 6; + return (ARCHIVE_OK); + } + + return (ARCHIVE_WARN); +} + +/* + * Write data to the compressed stream. + */ +static int +archive_compressor_xz_write(struct archive_write *a, const void *buff, + size_t length) +{ + struct private_data *state; + int ret; + + state = (struct private_data *)a->compressor.data; + if (a->client_writer == NULL) { + archive_set_error(&a->archive, ARCHIVE_ERRNO_PROGRAMMER, + "No write callback is registered? " + "This is probably an internal programming error."); + return (ARCHIVE_FATAL); + } + + /* Update statistics */ + state->total_in += length; + + /* Compress input data to output buffer */ + state->stream.next_in = buff; + state->stream.avail_in = length; + if ((ret = drive_compressor(a, state, 0)) != ARCHIVE_OK) + return (ret); + + a->archive.file_position += length; + return (ARCHIVE_OK); +} + + +/* + * Finish the compression... + */ +static int +archive_compressor_xz_finish(struct archive_write *a) +{ + ssize_t block_length, target_block_length, bytes_written; + int ret; + struct private_data *state; + unsigned tocopy; + + ret = ARCHIVE_OK; + state = (struct private_data *)a->compressor.data; + if (state != NULL) { + if (a->client_writer == NULL) { + archive_set_error(&a->archive, + ARCHIVE_ERRNO_PROGRAMMER, + "No write callback is registered? " + "This is probably an internal programming error."); + ret = ARCHIVE_FATAL; + goto cleanup; + } + + /* By default, always pad the uncompressed data. 
*/ + if (a->pad_uncompressed) { + tocopy = a->bytes_per_block - + (state->total_in % a->bytes_per_block); + while (tocopy > 0 && tocopy < (unsigned)a->bytes_per_block) { + state->stream.next_in = a->nulls; + state->stream.avail_in = tocopy < a->null_length ? + tocopy : a->null_length; + state->total_in += state->stream.avail_in; + tocopy -= state->stream.avail_in; + ret = drive_compressor(a, state, 0); + if (ret != ARCHIVE_OK) + goto cleanup; + } + } + + /* Finish compression cycle */ + if (((ret = drive_compressor(a, state, 1))) != ARCHIVE_OK) + goto cleanup; + + /* Optionally, pad the final compressed block. */ + block_length = state->stream.next_out - state->compressed; + + /* Tricky calculation to determine size of last block. */ + if (a->bytes_in_last_block <= 0) + /* Default or Zero: pad to full block */ + target_block_length = a->bytes_per_block; + else + /* Round length to next multiple of bytes_in_last_block. */ + target_block_length = a->bytes_in_last_block * + ( (block_length + a->bytes_in_last_block - 1) / + a->bytes_in_last_block); + if (target_block_length > a->bytes_per_block) + target_block_length = a->bytes_per_block; + if (block_length < target_block_length) { + memset(state->stream.next_out, 0, + target_block_length - block_length); + block_length = target_block_length; + } + + /* Write the last block */ + bytes_written = (a->client_writer)(&a->archive, a->client_data, + state->compressed, block_length); + if (bytes_written <= 0) { + ret = ARCHIVE_FATAL; + goto cleanup; + } + a->archive.raw_position += bytes_written; + + /* Cleanup: shut down compressor, release memory, etc. */ + cleanup: + lzma_end(&(state->stream)); + free(state->compressed); + free(state); + } + free(a->compressor.config); + a->compressor.config = NULL; + return (ret); +} + +/* + * Utility function to push input data through compressor, + * writing full output blocks as necessary. 
+ * + * Note that this handles both the regular write case (finishing == + * false) and the end-of-archive case (finishing == true). + */ +static int +drive_compressor(struct archive_write *a, struct private_data *state, int finishing) +{ + ssize_t bytes_written; + int ret; + + for (;;) { + if (state->stream.avail_out == 0) { + bytes_written = (a->client_writer)(&a->archive, + a->client_data, state->compressed, + state->compressed_buffer_size); + if (bytes_written <= 0) { + /* TODO: Handle this write failure */ + return (ARCHIVE_FATAL); + } else if ((size_t)bytes_written < state->compressed_buffer_size) { + /* Short write: Move remaining to + * front of block and keep filling */ + memmove(state->compressed, + state->compressed + bytes_written, + state->compressed_buffer_size - bytes_written); + } + a->archive.raw_position += bytes_written; + state->stream.next_out + = state->compressed + + state->compressed_buffer_size - bytes_written; + state->stream.avail_out = bytes_written; + } + + /* If there's nothing to do, we're done. */ + if (!finishing && state->stream.avail_in == 0) + return (ARCHIVE_OK); + + ret = lzma_code(&(state->stream), + finishing ? LZMA_FINISH : LZMA_RUN ); + + switch (ret) { + case LZMA_OK: + /* In non-finishing case, check if compressor + * consumed everything */ + if (!finishing && state->stream.avail_in == 0) + return (ARCHIVE_OK); + /* In finishing case, this return always means + * there's more work */ + break; + case LZMA_STREAM_END: + /* This return can only occur in finishing case. */ + if (finishing) + return (ARCHIVE_OK); + archive_set_error(&a->archive, ARCHIVE_ERRNO_MISC, + "lzma compression data error"); + return (ARCHIVE_FATAL); + case LZMA_MEMLIMIT_ERROR: + archive_set_error(&a->archive, ENOMEM, + "lzma compression error: " + "%ju MiB would have been needed", + (lzma_memusage(&(state->stream)) + 1024 * 1024 -1) + / (1024 * 1024)); + return (ARCHIVE_FATAL); + default: + /* Any other return value indicates an error. 
*/ + archive_set_error(&a->archive, ARCHIVE_ERRNO_MISC, + "lzma compression failed:" + " lzma_code() call returned status %d", + ret); + return (ARCHIVE_FATAL); + } + } +} + +#endif /* HAVE_LZMA_H */ diff --git a/lib/libarchive/archive_write_set_format.c b/lib/libarchive/archive_write_set_format.c new file mode 100644 index 000000000..848064856 --- /dev/null +++ b/lib/libarchive/archive_write_set_format.c @@ -0,0 +1,74 @@ +/*- + * Copyright (c) 2003-2007 Tim Kientzle + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR(S) ``AS IS'' AND ANY EXPRESS OR + * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES + * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. + * IN NO EVENT SHALL THE AUTHOR(S) BE LIABLE FOR ANY DIRECT, INDIRECT, + * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT + * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF + * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 
+ */ + +#include "archive_platform.h" +__FBSDID("$FreeBSD: head/lib/libarchive/archive_write_set_format.c 201168 2009-12-29 06:15:32Z kientzle $"); + +#ifdef HAVE_SYS_TYPES_H +#include <sys/types.h> +#endif + +#ifdef HAVE_ERRNO_H +#include <errno.h> +#endif + +#include "archive.h" +#include "archive_private.h" + +/* A table that maps format codes to functions. */ +static +struct { int code; int (*setter)(struct archive *); } codes[] = +{ +#ifndef __minix + { ARCHIVE_FORMAT_CPIO, archive_write_set_format_cpio }, + { ARCHIVE_FORMAT_CPIO_SVR4_NOCRC, archive_write_set_format_cpio_newc }, + { ARCHIVE_FORMAT_CPIO_POSIX, archive_write_set_format_cpio }, +#endif + { ARCHIVE_FORMAT_MTREE, archive_write_set_format_mtree }, + { ARCHIVE_FORMAT_SHAR, archive_write_set_format_shar }, + { ARCHIVE_FORMAT_SHAR_BASE, archive_write_set_format_shar }, + { ARCHIVE_FORMAT_SHAR_DUMP, archive_write_set_format_shar_dump }, + { ARCHIVE_FORMAT_TAR, archive_write_set_format_pax_restricted }, + { ARCHIVE_FORMAT_TAR_PAX_INTERCHANGE, archive_write_set_format_pax }, + { ARCHIVE_FORMAT_TAR_PAX_RESTRICTED, + archive_write_set_format_pax_restricted }, + { ARCHIVE_FORMAT_TAR_USTAR, archive_write_set_format_ustar }, + { ARCHIVE_FORMAT_ZIP, archive_write_set_format_zip }, + { 0, NULL } +}; + +int +archive_write_set_format(struct archive *a, int code) +{ + int i; + + for (i = 0; codes[i].code != 0; i++) { + if (code == codes[i].code) + return ((codes[i].setter)(a)); + } + + archive_set_error(a, EINVAL, "No such format"); + return (ARCHIVE_FATAL); +} diff --git a/lib/libarchive/archive_write_set_format_ar.c b/lib/libarchive/archive_write_set_format_ar.c new file mode 100644 index 000000000..54d297b0e --- /dev/null +++ b/lib/libarchive/archive_write_set_format_ar.c @@ -0,0 +1,643 @@ +/*- + * Copyright (c) 2007 Kai Wang + * Copyright (c) 2007 Tim Kientzle + * All rights reserved. 
+ * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer + * in this position and unchanged. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR(S) ``AS IS'' AND ANY EXPRESS OR + * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES + * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. + * IN NO EVENT SHALL THE AUTHOR(S) BE LIABLE FOR ANY DIRECT, INDIRECT, + * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT + * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF + * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 
+ */ + +#include "archive_platform.h" +__FBSDID("$FreeBSD: head/lib/libarchive/archive_write_set_format_ar.c 201108 2009-12-28 03:28:21Z kientzle $"); + +#ifdef HAVE_ERRNO_H +#include <errno.h> +#endif +#ifdef HAVE_STDLIB_H +#include <stdlib.h> +#endif +#ifdef HAVE_STRING_H +#include <string.h> +#endif + +#include "archive.h" +#include "archive_entry.h" +#include "archive_private.h" +#include "archive_write_private.h" + +#ifndef __minix +struct ar_w { + uint64_t entry_bytes_remaining; + uint64_t entry_padding; + int is_strtab; + int has_strtab; + char *strtab; +}; +#else +struct ar_w { + size_t entry_bytes_remaining; + size_t entry_padding; + int is_strtab; + int has_strtab; + char *strtab; +}; +#endif +/* + * Define structure of the "ar" header. + */ +#define AR_name_offset 0 +#define AR_name_size 16 +#define AR_date_offset 16 +#define AR_date_size 12 +#define AR_uid_offset 28 +#define AR_uid_size 6 +#define AR_gid_offset 34 +#define AR_gid_size 6 +#define AR_mode_offset 40 +#define AR_mode_size 8 +#define AR_size_offset 48 +#define AR_size_size 10 +#define AR_fmag_offset 58 +#define AR_fmag_size 2 + +static int archive_write_set_format_ar(struct archive_write *); +static int archive_write_ar_header(struct archive_write *, + struct archive_entry *); +static ssize_t archive_write_ar_data(struct archive_write *, + const void *buff, size_t s); +static int archive_write_ar_destroy(struct archive_write *); +static int archive_write_ar_finish(struct archive_write *); +static int archive_write_ar_finish_entry(struct archive_write *); +static const char *ar_basename(const char *path); +#ifndef __minix +static int format_octal(int64_t v, char *p, int s); +static int format_decimal(int64_t v, char *p, int s); +#else +static int format_octal(int32_t v, char *p, int s); +static int format_decimal(int32_t v, char *p, int s); +#endif + +int +archive_write_set_format_ar_bsd(struct archive *_a) +{ + struct archive_write *a = (struct archive_write *)_a; + int r = archive_write_set_format_ar(a); + if (r == 
ARCHIVE_OK) { + a->archive.archive_format = ARCHIVE_FORMAT_AR_BSD; + a->archive.archive_format_name = "ar (BSD)"; + } + return (r); +} + +int +archive_write_set_format_ar_svr4(struct archive *_a) +{ + struct archive_write *a = (struct archive_write *)_a; + int r = archive_write_set_format_ar(a); + if (r == ARCHIVE_OK) { + a->archive.archive_format = ARCHIVE_FORMAT_AR_GNU; + a->archive.archive_format_name = "ar (GNU/SVR4)"; + } + return (r); +} + +/* + * Generic initialization. + */ +static int +archive_write_set_format_ar(struct archive_write *a) +{ + struct ar_w *ar; + + /* If someone else was already registered, unregister them. */ + if (a->format_destroy != NULL) + (a->format_destroy)(a); + + ar = (struct ar_w *)malloc(sizeof(*ar)); + if (ar == NULL) { + archive_set_error(&a->archive, ENOMEM, "Can't allocate ar data"); + return (ARCHIVE_FATAL); + } + memset(ar, 0, sizeof(*ar)); + a->format_data = ar; + + a->format_name = "ar"; + a->format_write_header = archive_write_ar_header; + a->format_write_data = archive_write_ar_data; + a->format_finish = archive_write_ar_finish; + a->format_destroy = archive_write_ar_destroy; + a->format_finish_entry = archive_write_ar_finish_entry; + return (ARCHIVE_OK); +} + +static int +archive_write_ar_header(struct archive_write *a, struct archive_entry *entry) +{ + int ret, append_fn; + char buff[60]; + char *ss, *se; + struct ar_w *ar; + const char *pathname; + const char *filename; +#ifndef __minix + int64_t size; +#else + ssize_t size; +#endif + + append_fn = 0; + ar = (struct ar_w *)a->format_data; + ar->is_strtab = 0; + filename = NULL; + size = archive_entry_size(entry); + + + /* + * Reject files with empty name. + */ + pathname = archive_entry_pathname(entry); + if (*pathname == '\0') { + archive_set_error(&a->archive, EINVAL, + "Invalid filename"); + return (ARCHIVE_WARN); + } + + /* + * If we are now at the beginning of the archive, + * we need first write the ar global header. 
+ */ + if (a->archive.file_position == 0) + (a->compressor.write)(a, "!<arch>\n", 8); + + memset(buff, ' ', 60); + strncpy(&buff[AR_fmag_offset], "`\n", 2); + + if (strcmp(pathname, "/") == 0 ) { + /* Entry is archive symbol table in GNU format */ + buff[AR_name_offset] = '/'; + goto stat; + } + if (strcmp(pathname, "__.SYMDEF") == 0) { + /* Entry is archive symbol table in BSD format */ + strncpy(buff + AR_name_offset, "__.SYMDEF", 9); + goto stat; + } + if (strcmp(pathname, "//") == 0) { + /* + * Entry is archive filename table, inform that we should + * collect strtab in next _data call. + */ + ar->is_strtab = 1; + buff[AR_name_offset] = buff[AR_name_offset + 1] = '/'; + /* + * For archive string table, only ar_size filed should + * be set. + */ + goto size; + } + + /* + * Otherwise, entry is a normal archive member. + * Strip leading paths from filenames, if any. + */ + if ((filename = ar_basename(pathname)) == NULL) { + /* Reject filenames with trailing "/" */ + archive_set_error(&a->archive, EINVAL, + "Invalid filename"); + return (ARCHIVE_WARN); + } + + if (a->archive.archive_format == ARCHIVE_FORMAT_AR_GNU) { + /* + * SVR4/GNU variant use a "/" to mark then end of the filename, + * make it possible to have embedded spaces in the filename. + * So, the longest filename here (without extension) is + * actually 15 bytes. + */ + if (strlen(filename) <= 15) { + strncpy(&buff[AR_name_offset], + filename, strlen(filename)); + buff[AR_name_offset + strlen(filename)] = '/'; + } else { + /* + * For filename longer than 15 bytes, GNU variant + * makes use of a string table and instead stores the + * offset of the real filename to in the ar_name field. + * The string table should have been written before. 
+ */ + if (ar->has_strtab <= 0) { + archive_set_error(&a->archive, EINVAL, + "Can't find string table"); + return (ARCHIVE_WARN); + } + + se = (char *)malloc(strlen(filename) + 3); + if (se == NULL) { + archive_set_error(&a->archive, ENOMEM, + "Can't allocate filename buffer"); + return (ARCHIVE_FATAL); + } + + strncpy(se, filename, strlen(filename)); + strcpy(se + strlen(filename), "/\n"); + + ss = strstr(ar->strtab, se); + free(se); + + if (ss == NULL) { + archive_set_error(&a->archive, EINVAL, + "Invalid string table"); + return (ARCHIVE_WARN); + } + + /* + * GNU variant puts "/" followed by digits into + * ar_name field. These digits indicates the real + * filename string's offset to the string table. + */ + buff[AR_name_offset] = '/'; + if (format_decimal(ss - ar->strtab, + buff + AR_name_offset + 1, + AR_name_size - 1)) { + archive_set_error(&a->archive, ERANGE, + "string table offset too large"); + return (ARCHIVE_WARN); + } + } + } else if (a->archive.archive_format == ARCHIVE_FORMAT_AR_BSD) { + /* + * BSD variant: for any file name which is more than + * 16 chars or contains one or more embedded space(s), the + * string "#1/" followed by the ASCII length of the name is + * put into the ar_name field. The file size (stored in the + * ar_size field) is incremented by the length of the name. + * The name is then written immediately following the + * archive header. 
+ */ + if (strlen(filename) <= 16 && strchr(filename, ' ') == NULL) { + strncpy(&buff[AR_name_offset], filename, strlen(filename)); + buff[AR_name_offset + strlen(filename)] = ' '; + } + else { + strncpy(buff + AR_name_offset, "#1/", 3); + if (format_decimal(strlen(filename), + buff + AR_name_offset + 3, + AR_name_size - 3)) { + archive_set_error(&a->archive, ERANGE, + "File name too long"); + return (ARCHIVE_WARN); + } + append_fn = 1; + size += strlen(filename); + } + } + +stat: + if (format_decimal(archive_entry_mtime(entry), buff + AR_date_offset, AR_date_size)) { + archive_set_error(&a->archive, ERANGE, + "File modification time too large"); + return (ARCHIVE_WARN); + } + if (format_decimal(archive_entry_uid(entry), buff + AR_uid_offset, AR_uid_size)) { + archive_set_error(&a->archive, ERANGE, + "Numeric user ID too large"); + return (ARCHIVE_WARN); + } + if (format_decimal(archive_entry_gid(entry), buff + AR_gid_offset, AR_gid_size)) { + archive_set_error(&a->archive, ERANGE, + "Numeric group ID too large"); + return (ARCHIVE_WARN); + } + if (format_octal(archive_entry_mode(entry), buff + AR_mode_offset, AR_mode_size)) { + archive_set_error(&a->archive, ERANGE, + "Numeric mode too large"); + return (ARCHIVE_WARN); + } + /* + * Sanity Check: A non-pseudo archive member should always be + * a regular file. 
+ */ + if (filename != NULL && archive_entry_filetype(entry) != AE_IFREG) { + archive_set_error(&a->archive, EINVAL, + "Regular file required for non-pseudo member"); + return (ARCHIVE_WARN); + } + +size: + if (format_decimal(size, buff + AR_size_offset, AR_size_size)) { + archive_set_error(&a->archive, ERANGE, + "File size out of range"); + return (ARCHIVE_WARN); + } + + ret = (a->compressor.write)(a, buff, 60); + if (ret != ARCHIVE_OK) + return (ret); + + ar->entry_bytes_remaining = size; + ar->entry_padding = ar->entry_bytes_remaining % 2; + + if (append_fn > 0) { + ret = (a->compressor.write)(a, filename, strlen(filename)); + if (ret != ARCHIVE_OK) + return (ret); + ar->entry_bytes_remaining -= strlen(filename); + } + + return (ARCHIVE_OK); +} + +static ssize_t +archive_write_ar_data(struct archive_write *a, const void *buff, size_t s) +{ + struct ar_w *ar; + int ret; + + ar = (struct ar_w *)a->format_data; + if (s > ar->entry_bytes_remaining) + s = ar->entry_bytes_remaining; + + if (ar->is_strtab > 0) { + if (ar->has_strtab > 0) { + archive_set_error(&a->archive, EINVAL, + "More than one string tables exist"); + return (ARCHIVE_WARN); + } + + ar->strtab = (char *)malloc(s); + if (ar->strtab == NULL) { + archive_set_error(&a->archive, ENOMEM, + "Can't allocate strtab buffer"); + return (ARCHIVE_FATAL); + } + strncpy(ar->strtab, buff, s); + ar->has_strtab = 1; + } + + ret = (a->compressor.write)(a, buff, s); + if (ret != ARCHIVE_OK) + return (ret); + + ar->entry_bytes_remaining -= s; + return (s); +} + +static int +archive_write_ar_destroy(struct archive_write *a) +{ + struct ar_w *ar; + + ar = (struct ar_w *)a->format_data; + + if (ar == NULL) + return (ARCHIVE_OK); + + if (ar->has_strtab > 0) { + free(ar->strtab); + ar->strtab = NULL; + } + + free(ar); + a->format_data = NULL; + return (ARCHIVE_OK); +} + +static int +archive_write_ar_finish(struct archive_write *a) +{ + int ret; + + /* + * If we haven't written anything yet, we need to write + * the ar global 
header now to make it a valid ar archive. + */ + if (a->archive.file_position == 0) { + ret = (a->compressor.write)(a, "!<arch>\n", 8); + return (ret); + } + + return (ARCHIVE_OK); +} + +static int +archive_write_ar_finish_entry(struct archive_write *a) +{ + struct ar_w *ar; + int ret; + + ar = (struct ar_w *)a->format_data; + + if (ar->entry_bytes_remaining != 0) { + archive_set_error(&a->archive, ARCHIVE_ERRNO_MISC, + "Entry remaining bytes larger than 0"); + return (ARCHIVE_WARN); + } + + if (ar->entry_padding == 0) { + return (ARCHIVE_OK); + } + + if (ar->entry_padding != 1) { + archive_set_error(&a->archive, ARCHIVE_ERRNO_MISC, + "Padding wrong size: %d should be 1 or 0", + ar->entry_padding); + return (ARCHIVE_WARN); + } + + ret = (a->compressor.write)(a, "\n", 1); + return (ret); +} + +/* + * Format a number into the specified field using base-8. + * NB: This version is slightly different from the one in + * _ustar.c + */ +#ifndef __minix +static int +format_octal(int64_t v, char *p, int s) +{ + int len; + char *h; + + len = s; + h = p; + + /* Octal values can't be negative, so use 0. */ + if (v < 0) { + while (len-- > 0) + *p++ = '0'; + return (-1); + } + + p += s; /* Start at the end and work backwards. */ + do { + *--p = (char)('0' + (v & 7)); + v >>= 3; + } while (--s > 0 && v > 0); + + if (v == 0) { + memmove(h, p, len - s); + p = h + len - s; + while (s-- > 0) + *p++ = ' '; + return (0); + } + /* If it overflowed, fill field with max value. */ + while (len-- > 0) + *p++ = '7'; + + return (-1); +} +#else +static int +format_octal(int32_t v, char *p, int s) +{ + int len; + char *h; + + len = s; + h = p; + + /* Octal values can't be negative, so use 0. */ + if (v < 0) { + while (len-- > 0) + *p++ = '0'; + return (-1); + } + + p += s; /* Start at the end and work backwards. 
*/ + do { + *--p = (char)('0' + (v & 7)); + v >>= 3; + } while (--s > 0 && v > 0); + + if (v == 0) { + memmove(h, p, len - s); + p = h + len - s; + while (s-- > 0) + *p++ = ' '; + return (0); + } + /* If it overflowed, fill field with max value. */ + while (len-- > 0) + *p++ = '7'; + + return (-1); +} +#endif + +/* + * Format a number into the specified field using base-10. + */ +#ifndef __minix +static int +format_decimal(int64_t v, char *p, int s) +{ + int len; + char *h; + + len = s; + h = p; + + /* Negative values in ar header are meaningless , so use 0. */ + if (v < 0) { + while (len-- > 0) + *p++ = '0'; + return (-1); + } + + p += s; + do { + *--p = (char)('0' + (v % 10)); + v /= 10; + } while (--s > 0 && v > 0); + + if (v == 0) { + memmove(h, p, len - s); + p = h + len - s; + while (s-- > 0) + *p++ = ' '; + return (0); + } + /* If it overflowed, fill field with max value. */ + while (len-- > 0) + *p++ = '9'; + + return (-1); +} +#else +static int +format_decimal(int32_t v, char *p, int s) +{ + int len; + char *h; + + len = s; + h = p; + + /* Negative values in ar header are meaningless , so use 0. */ + if (v < 0) { + while (len-- > 0) + *p++ = '0'; + return (-1); + } + + p += s; + do { + *--p = (char)('0' + (v % 10)); + v /= 10; + } while (--s > 0 && v > 0); + + if (v == 0) { + memmove(h, p, len - s); + p = h + len - s; + while (s-- > 0) + *p++ = ' '; + return (0); + } + /* If it overflowed, fill field with max value. */ + while (len-- > 0) + *p++ = '9'; + + return (-1); +} +#endif +static const char * +ar_basename(const char *path) +{ + const char *endp, *startp; + + endp = path + strlen(path) - 1; + /* + * For filename with trailing slash(es), we return + * NULL indicating an error. 
+ */ + if (*endp == '/') + return (NULL); + + /* Find the start of the base */ + startp = endp; + while (startp > path && *(startp - 1) != '/') + startp--; + + return (startp); +} diff --git a/lib/libarchive/archive_write_set_format_by_name.c b/lib/libarchive/archive_write_set_format_by_name.c new file mode 100644 index 000000000..783a907a4 --- /dev/null +++ b/lib/libarchive/archive_write_set_format_by_name.c @@ -0,0 +1,78 @@ +/*- + * Copyright (c) 2003-2007 Tim Kientzle + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR(S) ``AS IS'' AND ANY EXPRESS OR + * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES + * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. + * IN NO EVENT SHALL THE AUTHOR(S) BE LIABLE FOR ANY DIRECT, INDIRECT, + * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT + * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF + * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 
+ */ + +#include "archive_platform.h" +__FBSDID("$FreeBSD: head/lib/libarchive/archive_write_set_format_by_name.c 201168 2009-12-29 06:15:32Z kientzle $"); + +#ifdef HAVE_SYS_TYPES_H +#include <sys/types.h> +#endif + +#ifdef HAVE_ERRNO_H +#include <errno.h> +#endif +#ifdef HAVE_STRING_H +#include <string.h> +#endif + +#include "archive.h" +#include "archive_private.h" + +/* A table that maps names to functions. */ +static +struct { const char *name; int (*setter)(struct archive *); } names[] = +{ + { "ar", archive_write_set_format_ar_bsd }, + { "arbsd", archive_write_set_format_ar_bsd }, + { "argnu", archive_write_set_format_ar_svr4 }, + { "arsvr4", archive_write_set_format_ar_svr4 }, + { "mtree", archive_write_set_format_mtree }, +#ifndef __minix + { "cpio", archive_write_set_format_cpio }, + { "newc", archive_write_set_format_cpio_newc }, + { "odc", archive_write_set_format_cpio }, +#endif + { "pax", archive_write_set_format_pax }, + { "posix", archive_write_set_format_pax }, + { "shar", archive_write_set_format_shar }, + { "shardump", archive_write_set_format_shar_dump }, + { "ustar", archive_write_set_format_ustar }, + { "zip", archive_write_set_format_zip }, + { NULL, NULL } +}; + +int +archive_write_set_format_by_name(struct archive *a, const char *name) +{ + int i; + + for (i = 0; names[i].name != NULL; i++) { + if (strcmp(name, names[i].name) == 0) + return ((names[i].setter)(a)); + } + + archive_set_error(a, EINVAL, "No such format '%s'", name); + return (ARCHIVE_FATAL); +} diff --git a/lib/libarchive/archive_write_set_format_mtree.c b/lib/libarchive/archive_write_set_format_mtree.c new file mode 100644 index 000000000..15c5cb789 --- /dev/null +++ b/lib/libarchive/archive_write_set_format_mtree.c @@ -0,0 +1,1062 @@ +/*- + * Copyright (c) 2009 Michihiro NAKAJIMA + * Copyright (c) 2008 Joerg Sonnenberger + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1.
Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR(S) ``AS IS'' AND ANY EXPRESS OR + * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES + * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. + * IN NO EVENT SHALL THE AUTHOR(S) BE LIABLE FOR ANY DIRECT, INDIRECT, + * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT + * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF + * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 
+ */ + +#include "archive_platform.h" +__FBSDID("$FreeBSD: head/lib/libarchive/archive_write_set_format_mtree.c 201171 2009-12-29 06:39:07Z kientzle $"); + +#ifdef HAVE_SYS_TYPES_H +#include <sys/types.h> +#endif +#include <errno.h> +#include <stdlib.h> +#include <string.h> + +#include "archive.h" +#include "archive_entry.h" +#include "archive_private.h" +#include "archive_write_private.h" + +#include "archive_hash.h" + +#define INDENTNAMELEN 15 +#define MAXLINELEN 80 + +struct mtree_writer { + struct archive_entry *entry; + struct archive_string ebuf; + struct archive_string buf; + int first; +#ifndef __minix + uint64_t entry_bytes_remaining; +#else + size_t entry_bytes_remaining; +#endif + struct { + int output; + int processed; + struct archive_string parent; + mode_t type; + int keys; + uid_t uid; + gid_t gid; + mode_t mode; + unsigned long fflags_set; + unsigned long fflags_clear; + } set; + /* check sum */ + int compute_sum; + uint32_t crc; +#ifndef __minix + uint64_t crc_len; +#else + uint32_t crc_len; +#endif +#ifdef ARCHIVE_HAS_MD5 + archive_md5_ctx md5ctx; +#endif +#ifdef ARCHIVE_HAS_RMD160 + archive_rmd160_ctx rmd160ctx; +#endif +#ifdef ARCHIVE_HAS_SHA1 + archive_sha1_ctx sha1ctx; +#endif +#ifdef ARCHIVE_HAS_SHA256 + archive_sha256_ctx sha256ctx; +#endif +#ifdef ARCHIVE_HAS_SHA384 + archive_sha384_ctx sha384ctx; +#endif +#ifdef ARCHIVE_HAS_SHA512 + archive_sha512_ctx sha512ctx; +#endif + /* Keyword options */ + int keys; +#define F_CKSUM 0x00000001 /* check sum */ +#define F_DEV 0x00000002 /* device type */ +#define F_DONE 0x00000004 /* directory done */ +#define F_FLAGS 0x00000008 /* file flags */ +#define F_GID 0x00000010 /* gid */ +#define F_GNAME 0x00000020 /* group name */ +#define F_IGN 0x00000040 /* ignore */ +#define F_MAGIC 0x00000080 /* name has magic chars */ +#define F_MD5 0x00000100 /* MD5 digest */ +#define F_MODE 0x00000200 /* mode */ +#define F_NLINK 0x00000400 /* number of links */ +#define F_NOCHANGE 0x00000800 /* If owner/mode "wrong", do + * not change */ +#define F_OPT
0x00001000 /* existence optional */ +#define F_RMD160 0x00002000 /* RIPEMD160 digest */ +#define F_SHA1 0x00004000 /* SHA-1 digest */ +#define F_SIZE 0x00008000 /* size */ +#define F_SLINK 0x00010000 /* symbolic link */ +#define F_TAGS 0x00020000 /* tags */ +#define F_TIME 0x00040000 /* modification time */ +#define F_TYPE 0x00080000 /* file type */ +#define F_UID 0x00100000 /* uid */ +#define F_UNAME 0x00200000 /* user name */ +#define F_VISIT 0x00400000 /* file visited */ +#define F_SHA256 0x00800000 /* SHA-256 digest */ +#define F_SHA384 0x01000000 /* SHA-384 digest */ +#define F_SHA512 0x02000000 /* SHA-512 digest */ + + /* Options */ + int dironly; /* if the dironly is 1, ignore everything except + * directory type files. like mtree(8) -d option. + */ + int indent; /* if the indent is 1, indent writing data. */ +}; + +#define DEFAULT_KEYS (F_DEV | F_FLAGS | F_GID | F_GNAME | F_SLINK | F_MODE\ + | F_NLINK | F_SIZE | F_TIME | F_TYPE | F_UID\ + | F_UNAME) + +#define COMPUTE_CRC(var, ch) (var) = (var) << 8 ^ crctab[(var) >> 24 ^ (ch)] +static const uint32_t crctab[] = { + 0x0, + 0x04c11db7, 0x09823b6e, 0x0d4326d9, 0x130476dc, 0x17c56b6b, + 0x1a864db2, 0x1e475005, 0x2608edb8, 0x22c9f00f, 0x2f8ad6d6, + 0x2b4bcb61, 0x350c9b64, 0x31cd86d3, 0x3c8ea00a, 0x384fbdbd, + 0x4c11db70, 0x48d0c6c7, 0x4593e01e, 0x4152fda9, 0x5f15adac, + 0x5bd4b01b, 0x569796c2, 0x52568b75, 0x6a1936c8, 0x6ed82b7f, + 0x639b0da6, 0x675a1011, 0x791d4014, 0x7ddc5da3, 0x709f7b7a, + 0x745e66cd, 0x9823b6e0, 0x9ce2ab57, 0x91a18d8e, 0x95609039, + 0x8b27c03c, 0x8fe6dd8b, 0x82a5fb52, 0x8664e6e5, 0xbe2b5b58, + 0xbaea46ef, 0xb7a96036, 0xb3687d81, 0xad2f2d84, 0xa9ee3033, + 0xa4ad16ea, 0xa06c0b5d, 0xd4326d90, 0xd0f37027, 0xddb056fe, + 0xd9714b49, 0xc7361b4c, 0xc3f706fb, 0xceb42022, 0xca753d95, + 0xf23a8028, 0xf6fb9d9f, 0xfbb8bb46, 0xff79a6f1, 0xe13ef6f4, + 0xe5ffeb43, 0xe8bccd9a, 0xec7dd02d, 0x34867077, 0x30476dc0, + 0x3d044b19, 0x39c556ae, 0x278206ab, 0x23431b1c, 0x2e003dc5, + 0x2ac12072, 0x128e9dcf, 
0x164f8078, 0x1b0ca6a1, 0x1fcdbb16, + 0x018aeb13, 0x054bf6a4, 0x0808d07d, 0x0cc9cdca, 0x7897ab07, + 0x7c56b6b0, 0x71159069, 0x75d48dde, 0x6b93dddb, 0x6f52c06c, + 0x6211e6b5, 0x66d0fb02, 0x5e9f46bf, 0x5a5e5b08, 0x571d7dd1, + 0x53dc6066, 0x4d9b3063, 0x495a2dd4, 0x44190b0d, 0x40d816ba, + 0xaca5c697, 0xa864db20, 0xa527fdf9, 0xa1e6e04e, 0xbfa1b04b, + 0xbb60adfc, 0xb6238b25, 0xb2e29692, 0x8aad2b2f, 0x8e6c3698, + 0x832f1041, 0x87ee0df6, 0x99a95df3, 0x9d684044, 0x902b669d, + 0x94ea7b2a, 0xe0b41de7, 0xe4750050, 0xe9362689, 0xedf73b3e, + 0xf3b06b3b, 0xf771768c, 0xfa325055, 0xfef34de2, 0xc6bcf05f, + 0xc27dede8, 0xcf3ecb31, 0xcbffd686, 0xd5b88683, 0xd1799b34, + 0xdc3abded, 0xd8fba05a, 0x690ce0ee, 0x6dcdfd59, 0x608edb80, + 0x644fc637, 0x7a089632, 0x7ec98b85, 0x738aad5c, 0x774bb0eb, + 0x4f040d56, 0x4bc510e1, 0x46863638, 0x42472b8f, 0x5c007b8a, + 0x58c1663d, 0x558240e4, 0x51435d53, 0x251d3b9e, 0x21dc2629, + 0x2c9f00f0, 0x285e1d47, 0x36194d42, 0x32d850f5, 0x3f9b762c, + 0x3b5a6b9b, 0x0315d626, 0x07d4cb91, 0x0a97ed48, 0x0e56f0ff, + 0x1011a0fa, 0x14d0bd4d, 0x19939b94, 0x1d528623, 0xf12f560e, + 0xf5ee4bb9, 0xf8ad6d60, 0xfc6c70d7, 0xe22b20d2, 0xe6ea3d65, + 0xeba91bbc, 0xef68060b, 0xd727bbb6, 0xd3e6a601, 0xdea580d8, + 0xda649d6f, 0xc423cd6a, 0xc0e2d0dd, 0xcda1f604, 0xc960ebb3, + 0xbd3e8d7e, 0xb9ff90c9, 0xb4bcb610, 0xb07daba7, 0xae3afba2, + 0xaafbe615, 0xa7b8c0cc, 0xa379dd7b, 0x9b3660c6, 0x9ff77d71, + 0x92b45ba8, 0x9675461f, 0x8832161a, 0x8cf30bad, 0x81b02d74, + 0x857130c3, 0x5d8a9099, 0x594b8d2e, 0x5408abf7, 0x50c9b640, + 0x4e8ee645, 0x4a4ffbf2, 0x470cdd2b, 0x43cdc09c, 0x7b827d21, + 0x7f436096, 0x7200464f, 0x76c15bf8, 0x68860bfd, 0x6c47164a, + 0x61043093, 0x65c52d24, 0x119b4be9, 0x155a565e, 0x18197087, + 0x1cd86d30, 0x029f3d35, 0x065e2082, 0x0b1d065b, 0x0fdc1bec, + 0x3793a651, 0x3352bbe6, 0x3e119d3f, 0x3ad08088, 0x2497d08d, + 0x2056cd3a, 0x2d15ebe3, 0x29d4f654, 0xc5a92679, 0xc1683bce, + 0xcc2b1d17, 0xc8ea00a0, 0xd6ad50a5, 0xd26c4d12, 0xdf2f6bcb, + 0xdbee767c, 0xe3a1cbc1, 0xe760d676, 
0xea23f0af, 0xeee2ed18, + 0xf0a5bd1d, 0xf464a0aa, 0xf9278673, 0xfde69bc4, 0x89b8fd09, + 0x8d79e0be, 0x803ac667, 0x84fbdbd0, 0x9abc8bd5, 0x9e7d9662, + 0x933eb0bb, 0x97ffad0c, 0xafb010b1, 0xab710d06, 0xa6322bdf, + 0xa2f33668, 0xbcb4666d, 0xb8757bda, 0xb5365d03, 0xb1f740b4 +}; + +static int +mtree_safe_char(char c) +{ + if ((c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')) + return 1; + if (c >= '0' && c <= '9') + return 1; + if (c == 35 || c == 61 || c == 92) + return 0; /* #, = and \ are always quoted */ + + if (c >= 33 && c <= 47) /* !"$%&'()*+,-./ */ + return 1; + if (c >= 58 && c <= 64) /* :;<>?@ */ + return 1; + if (c >= 91 && c <= 96) /* []^_` */ + return 1; + if (c >= 123 && c <= 126) /* {|}~ */ + return 1; + return 0; +} + +static void +mtree_quote(struct archive_string *s, const char *str) +{ + const char *start; + char buf[4]; + unsigned char c; + + for (start = str; *str != '\0'; ++str) { + if (mtree_safe_char(*str)) + continue; + if (start != str) + archive_strncat(s, start, str - start); + c = (unsigned char)*str; + buf[0] = '\\'; + buf[1] = (c / 64) + '0'; + buf[2] = (c / 8 % 8) + '0'; + buf[3] = (c % 8) + '0'; + archive_strncat(s, buf, 4); + start = str + 1; + } + + if (start != str) + archive_strncat(s, start, str - start); +} + +static void +mtree_indent(struct mtree_writer *mtree) +{ + int i, fn; + const char *r, *s, *x; + + fn = 1; + s = r = mtree->ebuf.s; + x = NULL; + while (*r == ' ') + r++; + while ((r = strchr(r, ' ')) != NULL) { + if (fn) { + fn = 0; + archive_strncat(&mtree->buf, s, r - s); + if (r -s > INDENTNAMELEN) { + archive_strncat(&mtree->buf, " \\\n", 3); + for (i = 0; i < (INDENTNAMELEN + 1); i++) + archive_strappend_char(&mtree->buf, ' '); + } else { + for (i = r -s; i < (INDENTNAMELEN + 1); i++) + archive_strappend_char(&mtree->buf, ' '); + } + s = ++r; + x = NULL; + continue; + } + if (r - s <= MAXLINELEN - 3 - INDENTNAMELEN) + x = r++; + else { + if (x == NULL) + x = r; + archive_strncat(&mtree->buf, s, x - s); + 
archive_strncat(&mtree->buf, " \\\n", 3); + for (i = 0; i < (INDENTNAMELEN + 1); i++) + archive_strappend_char(&mtree->buf, ' '); + s = r = ++x; + x = NULL; + } + } + if (x != NULL && strlen(s) > MAXLINELEN - 3 - INDENTNAMELEN) { + /* Last keyword is longer. */ + archive_strncat(&mtree->buf, s, x - s); + archive_strncat(&mtree->buf, " \\\n", 3); + for (i = 0; i < (INDENTNAMELEN + 1); i++) + archive_strappend_char(&mtree->buf, ' '); + s = ++x; + } + archive_strcat(&mtree->buf, s); + archive_string_empty(&mtree->ebuf); +} + +#if !defined(_WIN32) || defined(__CYGWIN__) +static size_t +dir_len(struct archive_entry *entry) +{ + const char *path, *r; + + path = archive_entry_pathname(entry); + r = strrchr(path, '/'); + if (r == NULL) + return (0); + /* Include a separator size */ + return (r - path + 1); +} + +#else /* _WIN32 && !__CYGWIN__ */ +/* + * Note: We should use wide-character for findng '\' character, + * a directory separator on Windows, because some character-set have + * been using the '\' character for a part of its multibyte character + * code. 
+ */ +static size_t +dir_len(struct archive_entry *entry) +{ + wchar_t wc; + const char *path; + const char *p, *rp; + size_t al, l, size; + + path = archive_entry_pathname(entry); + al = l = -1; + for (p = path; *p != '\0'; ++p) { + if (*p == '\\') + al = l = p - path; + else if (*p == '/') + al = p - path; + } + if (l == -1) + goto alen; + size = p - path; + rp = p = path; + while (*p != '\0') { + l = mbtowc(&wc, p, size); + if (l == -1) + goto alen; + if (l == 1 && (wc == L'/' || wc == L'\\')) + rp = p; + p += l; + size -= l; + } + return (rp - path + 1); +alen: + if (al == -1) + return (0); + return (al + 1); +} +#endif /* _WIN32 && !__CYGWIN__ */ + +static int +parent_dir_changed(struct archive_string *dir, struct archive_entry *entry) +{ + const char *path; + size_t l; + + l = dir_len(entry); + path = archive_entry_pathname(entry); + if (archive_strlen(dir) > 0) { + if (l == 0) { + archive_string_empty(dir); + return (1); + } + if (strncmp(dir->s, path, l) == 0) + return (0); /* The parent directory is the same. */ + } else if (l == 0) + return (0); /* The parent directory is the same. */ + archive_strncpy(dir, path, l); + return (1); +} + +/* + * Write /set keyword. It means set global datas. + * [directory-only mode] + * - It is only once to write /set keyword. It is using values of the + * first entry. + * [normal mode] + * - Write /set keyword. It is using values of the first entry whose + * filetype is a regular file. + * - When a parent directory of the entry whose filetype is the regular + * file is changed, check the global datas and write it again if its + * values are different from the entry's. 
+ */ +static void +set_global(struct mtree_writer *mtree, struct archive_entry *entry) +{ + struct archive_string setstr; + struct archive_string unsetstr; + const char *name; + int keys, oldkeys, effkeys; + mode_t set_type = 0; + + switch (archive_entry_filetype(entry)) { + case AE_IFLNK: case AE_IFSOCK: case AE_IFCHR: + case AE_IFBLK: case AE_IFIFO: + break; + case AE_IFDIR: + if (mtree->dironly) + set_type = AE_IFDIR; + break; + case AE_IFREG: + default: /* Handle unknown file types as regular files. */ + if (!mtree->dironly) + set_type = AE_IFREG; + break; + } + if (set_type == 0) + return; + if (mtree->set.processed && + !parent_dir_changed(&mtree->set.parent, entry)) + return; + /* At first, save a parent directory of the entry for following + * entries. */ + if (!mtree->set.processed && set_type == AE_IFREG) + parent_dir_changed(&mtree->set.parent, entry); + + archive_string_init(&setstr); + archive_string_init(&unsetstr); + keys = mtree->keys & (F_FLAGS | F_GID | F_GNAME | F_NLINK | F_MODE + | F_TYPE | F_UID | F_UNAME); + oldkeys = mtree->set.keys; + effkeys = keys; + if (mtree->set.processed) { + /* + * Check the global datas for whether it needs updating. 
+ */ + effkeys &= ~F_TYPE; + if ((oldkeys & (F_UNAME | F_UID)) != 0 && + mtree->set.uid == archive_entry_uid(entry)) + effkeys &= ~(F_UNAME | F_UID); + if ((oldkeys & (F_GNAME | F_GID)) != 0 && + mtree->set.gid == archive_entry_gid(entry)) + effkeys &= ~(F_GNAME | F_GID); + if ((oldkeys & F_MODE) != 0 && + mtree->set.mode == (archive_entry_mode(entry) & 07777)) + effkeys &= ~F_MODE; + if ((oldkeys & F_FLAGS) != 0) { + unsigned long fflags_set; + unsigned long fflags_clear; + + archive_entry_fflags(entry, &fflags_set, &fflags_clear); + if (fflags_set == mtree->set.fflags_set && + fflags_clear == mtree->set.fflags_clear) + effkeys &= ~F_FLAGS; + } + } + if ((keys & effkeys & F_TYPE) != 0) { + mtree->set.type = set_type; + if (set_type == AE_IFDIR) + archive_strcat(&setstr, " type=dir"); + else + archive_strcat(&setstr, " type=file"); + } + if ((keys & effkeys & F_UNAME) != 0) { + if ((name = archive_entry_uname(entry)) != NULL) { + archive_strcat(&setstr, " uname="); + mtree_quote(&setstr, name); + } else if ((oldkeys & F_UNAME) != 0) + archive_strcat(&unsetstr, " uname"); + else + keys &= ~F_UNAME; + } + if ((keys & effkeys & F_UID) != 0) { + mtree->set.uid = archive_entry_uid(entry); + archive_string_sprintf(&setstr, " uid=%jd", + (intmax_t)mtree->set.uid); + } + if ((keys & effkeys & F_GNAME) != 0) { + if ((name = archive_entry_gname(entry)) != NULL) { + archive_strcat(&setstr, " gname="); + mtree_quote(&setstr, name); + } else if ((oldkeys & F_GNAME) != 0) + archive_strcat(&unsetstr, " gname"); + else + keys &= ~F_GNAME; + } + if ((keys & effkeys & F_GID) != 0) { + mtree->set.gid = archive_entry_gid(entry); + archive_string_sprintf(&setstr, " gid=%jd", + (intmax_t)mtree->set.gid); + } + if ((keys & effkeys & F_MODE) != 0) { + mtree->set.mode = archive_entry_mode(entry) & 07777; + archive_string_sprintf(&setstr, " mode=%o", mtree->set.mode); + } + if ((keys & effkeys & F_FLAGS) != 0) { + if ((name = archive_entry_fflags_text(entry)) != NULL) { + 
archive_strcat(&setstr, " flags="); + mtree_quote(&setstr, name); + archive_entry_fflags(entry, &mtree->set.fflags_set, + &mtree->set.fflags_clear); + } else if ((oldkeys & F_FLAGS) != 0) + archive_strcat(&unsetstr, " flags"); + else + keys &= ~F_FLAGS; + } + if (unsetstr.length > 0) + archive_string_sprintf(&mtree->buf, "/unset%s\n", unsetstr.s); + archive_string_free(&unsetstr); + if (setstr.length > 0) + archive_string_sprintf(&mtree->buf, "/set%s\n", setstr.s); + archive_string_free(&setstr); + mtree->set.keys = keys; + mtree->set.processed = 1; + /* On directory-only mode, it is only once to write /set keyword. */ + if (mtree->dironly) + mtree->set.output = 0; +} + +static int +get_keys(struct mtree_writer *mtree, struct archive_entry *entry) +{ + int keys; + + keys = mtree->keys; + if (mtree->set.keys == 0) + return (keys); + if ((mtree->set.keys & (F_GNAME | F_GID)) != 0 && + mtree->set.gid == archive_entry_gid(entry)) + keys &= ~(F_GNAME | F_GID); + if ((mtree->set.keys & (F_UNAME | F_UID)) != 0 && + mtree->set.uid == archive_entry_uid(entry)) + keys &= ~(F_UNAME | F_UID); + if (mtree->set.keys & F_FLAGS) { + unsigned long set, clear; + + archive_entry_fflags(entry, &set, &clear); + if (mtree->set.fflags_set == set && + mtree->set.fflags_clear == clear) + keys &= ~F_FLAGS; + } + if ((mtree->set.keys & F_MODE) != 0 && + mtree->set.mode == (archive_entry_mode(entry) & 07777)) + keys &= ~F_MODE; + + switch (archive_entry_filetype(entry)) { + case AE_IFLNK: case AE_IFSOCK: case AE_IFCHR: + case AE_IFBLK: case AE_IFIFO: + break; + case AE_IFDIR: + if ((mtree->set.keys & F_TYPE) != 0 && + mtree->set.type == AE_IFDIR) + keys &= ~F_TYPE; + break; + case AE_IFREG: + default: /* Handle unknown file types as regular files. 
*/ + if ((mtree->set.keys & F_TYPE) != 0 && + mtree->set.type == AE_IFREG) + keys &= ~F_TYPE; + break; + } + + return (keys); +} + +static int +archive_write_mtree_header(struct archive_write *a, + struct archive_entry *entry) +{ + struct mtree_writer *mtree= a->format_data; + struct archive_string *str; + const char *path; + + mtree->entry = archive_entry_clone(entry); + path = archive_entry_pathname(mtree->entry); + + if (mtree->first) { + mtree->first = 0; + archive_strcat(&mtree->buf, "#mtree\n"); + } + if (mtree->set.output) + set_global(mtree, entry); + + archive_string_empty(&mtree->ebuf); + str = (mtree->indent)? &mtree->ebuf : &mtree->buf; + if (!mtree->dironly || archive_entry_filetype(entry) == AE_IFDIR) + mtree_quote(str, path); + + mtree->entry_bytes_remaining = archive_entry_size(entry); + if ((mtree->keys & F_CKSUM) != 0 && + archive_entry_filetype(entry) == AE_IFREG) { + mtree->compute_sum |= F_CKSUM; + mtree->crc = 0; + mtree->crc_len = 0; + } else + mtree->compute_sum &= ~F_CKSUM; +#ifdef ARCHIVE_HAS_MD5 + if ((mtree->keys & F_MD5) != 0 && + archive_entry_filetype(entry) == AE_IFREG) { + mtree->compute_sum |= F_MD5; + archive_md5_init(&mtree->md5ctx); + } else + mtree->compute_sum &= ~F_MD5; +#endif +#ifdef ARCHIVE_HAS_RMD160 + if ((mtree->keys & F_RMD160) != 0 && + archive_entry_filetype(entry) == AE_IFREG) { + mtree->compute_sum |= F_RMD160; + archive_rmd160_init(&mtree->rmd160ctx); + } else + mtree->compute_sum &= ~F_RMD160; +#endif +#ifdef ARCHIVE_HAS_SHA1 + if ((mtree->keys & F_SHA1) != 0 && + archive_entry_filetype(entry) == AE_IFREG) { + mtree->compute_sum |= F_SHA1; + archive_sha1_init(&mtree->sha1ctx); + } else + mtree->compute_sum &= ~F_SHA1; +#endif +#ifdef ARCHIVE_HAS_SHA256 + if ((mtree->keys & F_SHA256) != 0 && + archive_entry_filetype(entry) == AE_IFREG) { + mtree->compute_sum |= F_SHA256; + archive_sha256_init(&mtree->sha256ctx); + } else + mtree->compute_sum &= ~F_SHA256; +#endif +#ifdef ARCHIVE_HAS_SHA384 + if ((mtree->keys & 
F_SHA384) != 0 && + archive_entry_filetype(entry) == AE_IFREG) { + mtree->compute_sum |= F_SHA384; + archive_sha384_init(&mtree->sha384ctx); + } else + mtree->compute_sum &= ~F_SHA384; +#endif +#ifdef ARCHIVE_HAS_SHA512 + if ((mtree->keys & F_SHA512) != 0 && + archive_entry_filetype(entry) == AE_IFREG) { + mtree->compute_sum |= F_SHA512; + archive_sha512_init(&mtree->sha512ctx); + } else + mtree->compute_sum &= ~F_SHA512; +#endif + + return (ARCHIVE_OK); +} + +#if defined(ARCHIVE_HAS_MD5) || defined(ARCHIVE_HAS_RMD160) || \ + defined(ARCHIVE_HAS_SHA1) || defined(ARCHIVE_HAS_SHA256) || \ + defined(ARCHIVE_HAS_SHA384) || defined(ARCHIVE_HAS_SHA512) +static void +strappend_bin(struct archive_string *s, const unsigned char *bin, int n) +{ + static const char hex[] = "0123456789abcdef"; + int i; + + for (i = 0; i < n; i++) { + archive_strappend_char(s, hex[bin[i] >> 4]); + archive_strappend_char(s, hex[bin[i] & 0x0f]); + } +} +#endif + +static int +archive_write_mtree_finish_entry(struct archive_write *a) +{ + struct mtree_writer *mtree = a->format_data; + struct archive_entry *entry; + struct archive_string *str; + const char *name; + int keys, ret; + + entry = mtree->entry; + if (entry == NULL) { + archive_set_error(&a->archive, ARCHIVE_ERRNO_PROGRAMMER, + "Finished entry without being open first."); + return (ARCHIVE_FATAL); + } + mtree->entry = NULL; + + if (mtree->dironly && archive_entry_filetype(entry) != AE_IFDIR) { + archive_entry_free(entry); + return (ARCHIVE_OK); + } + + str = (mtree->indent)? 
&mtree->ebuf : &mtree->buf; + keys = get_keys(mtree, entry); + if ((keys & F_NLINK) != 0 && + archive_entry_nlink(entry) != 1 && + archive_entry_filetype(entry) != AE_IFDIR) + archive_string_sprintf(str, + " nlink=%u", archive_entry_nlink(entry)); + + if ((keys & F_GNAME) != 0 && + (name = archive_entry_gname(entry)) != NULL) { + archive_strcat(str, " gname="); + mtree_quote(str, name); + } + if ((keys & F_UNAME) != 0 && + (name = archive_entry_uname(entry)) != NULL) { + archive_strcat(str, " uname="); + mtree_quote(str, name); + } + if ((keys & F_FLAGS) != 0 && + (name = archive_entry_fflags_text(entry)) != NULL) { + archive_strcat(str, " flags="); + mtree_quote(str, name); + } + if ((keys & F_TIME) != 0) + archive_string_sprintf(str, " time=%jd.%jd", + (intmax_t)archive_entry_mtime(entry), + (intmax_t)archive_entry_mtime_nsec(entry)); + if ((keys & F_MODE) != 0) + archive_string_sprintf(str, " mode=%o", + archive_entry_mode(entry) & 07777); + if ((keys & F_GID) != 0) + archive_string_sprintf(str, " gid=%jd", + (intmax_t)archive_entry_gid(entry)); + if ((keys & F_UID) != 0) + archive_string_sprintf(str, " uid=%jd", + (intmax_t)archive_entry_uid(entry)); + + switch (archive_entry_filetype(entry)) { + case AE_IFLNK: + if ((keys & F_TYPE) != 0) + archive_strcat(str, " type=link"); + if ((keys & F_SLINK) != 0) { + archive_strcat(str, " link="); + mtree_quote(str, archive_entry_symlink(entry)); + } + break; + case AE_IFSOCK: + if ((keys & F_TYPE) != 0) + archive_strcat(str, " type=socket"); + break; + case AE_IFCHR: + if ((keys & F_TYPE) != 0) + archive_strcat(str, " type=char"); + if ((keys & F_DEV) != 0) { + archive_string_sprintf(str, + " device=native,%d,%d", + archive_entry_rdevmajor(entry), + archive_entry_rdevminor(entry)); + } + break; + case AE_IFBLK: + if ((keys & F_TYPE) != 0) + archive_strcat(str, " type=block"); + if ((keys & F_DEV) != 0) { + archive_string_sprintf(str, + " device=native,%d,%d", + archive_entry_rdevmajor(entry), + 
archive_entry_rdevminor(entry)); + } + break; + case AE_IFDIR: + if ((keys & F_TYPE) != 0) + archive_strcat(str, " type=dir"); + break; + case AE_IFIFO: + if ((keys & F_TYPE) != 0) + archive_strcat(str, " type=fifo"); + break; + case AE_IFREG: + default: /* Handle unknown file types as regular files. */ + if ((keys & F_TYPE) != 0) + archive_strcat(str, " type=file"); + if ((keys & F_SIZE) != 0) + archive_string_sprintf(str, " size=%jd", + (intmax_t)archive_entry_size(entry)); + break; + } + + if (mtree->compute_sum & F_CKSUM) { +#ifndef __minix + uint64_t len; +#else + uint32_t len; +#endif + /* Include the length of the file. */ + for (len = mtree->crc_len; len != 0; len >>= 8) + COMPUTE_CRC(mtree->crc, len & 0xff); + mtree->crc = ~mtree->crc; + archive_string_sprintf(str, " cksum=%ju", + (uintmax_t)mtree->crc); + } +#ifdef ARCHIVE_HAS_MD5 + if (mtree->compute_sum & F_MD5) { + unsigned char buf[16]; + + archive_md5_final(&mtree->md5ctx, buf); + archive_strcat(str, " md5digest="); + strappend_bin(str, buf, sizeof(buf)); + } +#endif +#ifdef ARCHIVE_HAS_RMD160 + if (mtree->compute_sum & F_RMD160) { + unsigned char buf[20]; + + archive_rmd160_final(&mtree->rmd160ctx, buf); + archive_strcat(str, " rmd160digest="); + strappend_bin(str, buf, sizeof(buf)); + } +#endif +#ifdef ARCHIVE_HAS_SHA1 + if (mtree->compute_sum & F_SHA1) { + unsigned char buf[20]; + + archive_sha1_final(&mtree->sha1ctx, buf); + archive_strcat(str, " sha1digest="); + strappend_bin(str, buf, sizeof(buf)); + } +#endif +#ifdef ARCHIVE_HAS_SHA256 + if (mtree->compute_sum & F_SHA256) { + unsigned char buf[32]; + + archive_sha256_final(&mtree->sha256ctx, buf); + archive_strcat(str, " sha256digest="); + strappend_bin(str, buf, sizeof(buf)); + } +#endif +#ifdef ARCHIVE_HAS_SHA384 + if (mtree->compute_sum & F_SHA384) { + unsigned char buf[48]; + + archive_sha384_final(&mtree->sha384ctx, buf); + archive_strcat(str, " sha384digest="); + strappend_bin(str, buf, sizeof(buf)); + } +#endif +#ifdef 
ARCHIVE_HAS_SHA512 + if (mtree->compute_sum & F_SHA512) { + unsigned char buf[64]; + + archive_sha512_final(&mtree->sha512ctx, buf); + archive_strcat(str, " sha512digest="); + strappend_bin(str, buf, sizeof(buf)); + } +#endif + archive_strcat(str, "\n"); + if (mtree->indent) + mtree_indent(mtree); + + archive_entry_free(entry); + + if (mtree->buf.length > 32768) { + ret = (a->compressor.write)(a, mtree->buf.s, mtree->buf.length); + archive_string_empty(&mtree->buf); + } else + ret = ARCHIVE_OK; + + return (ret == ARCHIVE_OK ? ret : ARCHIVE_FATAL); +} + +static int +archive_write_mtree_finish(struct archive_write *a) +{ + struct mtree_writer *mtree= a->format_data; + + archive_write_set_bytes_in_last_block(&a->archive, 1); + + return (a->compressor.write)(a, mtree->buf.s, mtree->buf.length); +} + +static ssize_t +archive_write_mtree_data(struct archive_write *a, const void *buff, size_t n) +{ + struct mtree_writer *mtree= a->format_data; + + if (n > mtree->entry_bytes_remaining) + n = mtree->entry_bytes_remaining; + if (mtree->dironly) + /* We don't need compute a regular file sum */ + return (n); + if (mtree->compute_sum & F_CKSUM) { + /* + * Compute a POSIX 1003.2 checksum + */ + const unsigned char *p; + size_t nn; + + for (nn = n, p = buff; nn--; ++p) + COMPUTE_CRC(mtree->crc, *p); + mtree->crc_len += n; + } +#ifdef ARCHIVE_HAS_MD5 + if (mtree->compute_sum & F_MD5) + archive_md5_update(&mtree->md5ctx, buff, n); +#endif +#ifdef ARCHIVE_HAS_RMD160 + if (mtree->compute_sum & F_RMD160) + archive_rmd160_update(&mtree->rmd160ctx, buff, n); +#endif +#ifdef ARCHIVE_HAS_SHA1 + if (mtree->compute_sum & F_SHA1) + archive_sha1_update(&mtree->sha1ctx, buff, n); +#endif +#ifdef ARCHIVE_HAS_SHA256 + if (mtree->compute_sum & F_SHA256) + archive_sha256_update(&mtree->sha256ctx, buff, n); +#endif +#ifdef ARCHIVE_HAS_SHA384 + if (mtree->compute_sum & F_SHA384) + archive_sha384_update(&mtree->sha384ctx, buff, n); +#endif +#ifdef ARCHIVE_HAS_SHA512 + if (mtree->compute_sum & 
F_SHA512) + archive_sha512_update(&mtree->sha512ctx, buff, n); +#endif + return (n); +} + +static int +archive_write_mtree_destroy(struct archive_write *a) +{ + struct mtree_writer *mtree= a->format_data; + + if (mtree == NULL) + return (ARCHIVE_OK); + + archive_entry_free(mtree->entry); + archive_string_free(&mtree->ebuf); + archive_string_free(&mtree->buf); + archive_string_free(&mtree->set.parent); + free(mtree); + a->format_data = NULL; + return (ARCHIVE_OK); +} + +static int +archive_write_mtree_options(struct archive_write *a, const char *key, + const char *value) +{ + struct mtree_writer *mtree= a->format_data; + int keybit = 0; + + switch (key[0]) { + case 'a': + if (strcmp(key, "all") == 0) + keybit = ~0; + break; + case 'c': + if (strcmp(key, "cksum") == 0) + keybit = F_CKSUM; + break; + case 'd': + if (strcmp(key, "device") == 0) + keybit = F_DEV; + else if (strcmp(key, "dironly") == 0) { + mtree->dironly = (value != NULL)? 1: 0; + return (ARCHIVE_OK); + } + break; + case 'f': + if (strcmp(key, "flags") == 0) + keybit = F_FLAGS; + break; + case 'g': + if (strcmp(key, "gid") == 0) + keybit = F_GID; + else if (strcmp(key, "gname") == 0) + keybit = F_GNAME; + break; + case 'i': + if (strcmp(key, "indent") == 0) { + mtree->indent = (value != NULL)? 
1: 0; + return (ARCHIVE_OK); + } + break; + case 'l': + if (strcmp(key, "link") == 0) + keybit = F_SLINK; + break; + case 'm': + if (strcmp(key, "md5") == 0 || + strcmp(key, "md5digest") == 0) + keybit = F_MD5; + if (strcmp(key, "mode") == 0) + keybit = F_MODE; + break; + case 'n': + if (strcmp(key, "nlink") == 0) + keybit = F_NLINK; + break; + case 'r': + if (strcmp(key, "ripemd160digest") == 0 || + strcmp(key, "rmd160") == 0 || + strcmp(key, "rmd160digest") == 0) + keybit = F_RMD160; + break; + case 's': + if (strcmp(key, "sha1") == 0 || + strcmp(key, "sha1digest") == 0) + keybit = F_SHA1; + if (strcmp(key, "sha256") == 0 || + strcmp(key, "sha256digest") == 0) + keybit = F_SHA256; + if (strcmp(key, "sha384") == 0 || + strcmp(key, "sha384digest") == 0) + keybit = F_SHA384; + if (strcmp(key, "sha512") == 0 || + strcmp(key, "sha512digest") == 0) + keybit = F_SHA512; + if (strcmp(key, "size") == 0) + keybit = F_SIZE; + break; + case 't': + if (strcmp(key, "time") == 0) + keybit = F_TIME; + else if (strcmp(key, "type") == 0) + keybit = F_TYPE; + break; + case 'u': + if (strcmp(key, "uid") == 0) + keybit = F_UID; + else if (strcmp(key, "uname") == 0) + keybit = F_UNAME; + else if (strcmp(key, "use-set") == 0) { + mtree->set.output = (value != NULL)? 
1: 0; + return (ARCHIVE_OK); + } + break; + } + if (keybit != 0) { + if (value != NULL) + mtree->keys |= keybit; + else + mtree->keys &= ~keybit; + return (ARCHIVE_OK); + } + + return (ARCHIVE_WARN); +} + +int +archive_write_set_format_mtree(struct archive *_a) +{ + struct archive_write *a = (struct archive_write *)_a; + struct mtree_writer *mtree; + + if (a->format_destroy != NULL) + (a->format_destroy)(a); + + if ((mtree = malloc(sizeof(*mtree))) == NULL) { + archive_set_error(&a->archive, ENOMEM, + "Can't allocate mtree data"); + return (ARCHIVE_FATAL); + } + + mtree->entry = NULL; + mtree->first = 1; + memset(&(mtree->set), 0, sizeof(mtree->set)); + archive_string_init(&mtree->set.parent); + mtree->keys = DEFAULT_KEYS; + mtree->dironly = 0; + mtree->indent = 0; + archive_string_init(&mtree->ebuf); + archive_string_init(&mtree->buf); + a->format_data = mtree; + a->format_destroy = archive_write_mtree_destroy; + + a->pad_uncompressed = 0; + a->format_name = "mtree"; + a->format_options = archive_write_mtree_options; + a->format_write_header = archive_write_mtree_header; + a->format_finish = archive_write_mtree_finish; + a->format_write_data = archive_write_mtree_data; + a->format_finish_entry = archive_write_mtree_finish_entry; + a->archive.archive_format = ARCHIVE_FORMAT_MTREE; + a->archive.archive_format_name = "mtree"; + + return (ARCHIVE_OK); +} diff --git a/lib/libarchive/archive_write_set_format_pax.c b/lib/libarchive/archive_write_set_format_pax.c new file mode 100644 index 000000000..eab20c2b2 --- /dev/null +++ b/lib/libarchive/archive_write_set_format_pax.c @@ -0,0 +1,1497 @@ +/*- + * Copyright (c) 2003-2007 Tim Kientzle + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. 
Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR(S) ``AS IS'' AND ANY EXPRESS OR + * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES + * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. + * IN NO EVENT SHALL THE AUTHOR(S) BE LIABLE FOR ANY DIRECT, INDIRECT, + * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT + * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF + * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + */ + +#include "archive_platform.h" +__FBSDID("$FreeBSD: head/lib/libarchive/archive_write_set_format_pax.c 201162 2009-12-29 05:47:46Z kientzle $"); + +#ifdef HAVE_ERRNO_H +#include <errno.h> +#endif +#ifdef HAVE_STDLIB_H +#include <stdlib.h> +#endif +#ifdef HAVE_STRING_H +#include <string.h> +#endif + +#include "archive.h" +#include "archive_entry.h" +#include "archive_private.h" +#include "archive_write_private.h" + +#ifndef __minix +struct pax { + uint64_t entry_bytes_remaining; + uint64_t entry_padding; + struct archive_string pax_header; +}; +#else +struct pax { + size_t entry_bytes_remaining; + off_t entry_padding; + struct archive_string pax_header; +}; +#endif +static void add_pax_attr(struct archive_string *, const char *key, + const char *value); +#ifndef __minix +static void add_pax_attr_int(struct archive_string *, + const char *key, int64_t value); +static void add_pax_attr_time(struct archive_string *, + const char *key, int64_t sec, + unsigned long nanos); +#else +static void add_pax_attr_int(struct archive_string *, + const char
*key, int32_t value); +static void add_pax_attr_time(struct archive_string *, + const char *key, time_t sec, + unsigned long nanos); +#endif +static void add_pax_attr_w(struct archive_string *, + const char *key, const wchar_t *wvalue); +static ssize_t archive_write_pax_data(struct archive_write *, + const void *, size_t); +static int archive_write_pax_finish(struct archive_write *); +static int archive_write_pax_destroy(struct archive_write *); +static int archive_write_pax_finish_entry(struct archive_write *); +static int archive_write_pax_header(struct archive_write *, + struct archive_entry *); +static char *base64_encode(const char *src, size_t len); +static char *build_pax_attribute_name(char *dest, const char *src); +static char *build_ustar_entry_name(char *dest, const char *src, + size_t src_length, const char *insert); +#ifndef __minix +static char *format_int(char *dest, int64_t); +#else +static char *format_int(char *dest, int32_t); +#endif +static int has_non_ASCII(const wchar_t *); +static char *url_encode(const char *in); +static int write_nulls(struct archive_write *, size_t); + +/* + * Set output format to 'restricted pax' format. + * + * This is the same as normal 'pax', but tries to suppress + * the pax header whenever possible. This is the default for + * bsdtar, for instance. + */ +int +archive_write_set_format_pax_restricted(struct archive *_a) +{ + struct archive_write *a = (struct archive_write *)_a; + int r; + r = archive_write_set_format_pax(&a->archive); + a->archive.archive_format = ARCHIVE_FORMAT_TAR_PAX_RESTRICTED; + a->archive.archive_format_name = "restricted POSIX pax interchange"; + return (r); +} + +/* + * Set output format to 'pax' format. 
+ */ +int +archive_write_set_format_pax(struct archive *_a) +{ + struct archive_write *a = (struct archive_write *)_a; + struct pax *pax; + + if (a->format_destroy != NULL) + (a->format_destroy)(a); + + pax = (struct pax *)malloc(sizeof(*pax)); + if (pax == NULL) { + archive_set_error(&a->archive, ENOMEM, "Can't allocate pax data"); + return (ARCHIVE_FATAL); + } + memset(pax, 0, sizeof(*pax)); + a->format_data = pax; + + a->pad_uncompressed = 1; + a->format_name = "pax"; + a->format_write_header = archive_write_pax_header; + a->format_write_data = archive_write_pax_data; + a->format_finish = archive_write_pax_finish; + a->format_destroy = archive_write_pax_destroy; + a->format_finish_entry = archive_write_pax_finish_entry; + a->archive.archive_format = ARCHIVE_FORMAT_TAR_PAX_INTERCHANGE; + a->archive.archive_format_name = "POSIX pax interchange"; + return (ARCHIVE_OK); +} + +/* + * Note: This code assumes that 'nanos' has the same sign as 'sec', + * which implies that sec=-1, nanos=200000000 represents -1.2 seconds + * and not -0.8 seconds. This is a pretty pedantic point, as we're + * unlikely to encounter many real files created before Jan 1, 1970, + * much less ones with timestamps recorded to sub-second resolution. + */ +#ifndef __minix +static void +add_pax_attr_time(struct archive_string *as, const char *key, + int64_t sec, unsigned long nanos) +{ + int digit, i; + char *t; + /* + * Note that each byte contributes fewer than 3 base-10 + * digits, so this will always be big enough. + */ + char tmp[1 + 3*sizeof(sec) + 1 + 3*sizeof(nanos)]; + + tmp[sizeof(tmp) - 1] = 0; + t = tmp + sizeof(tmp) - 1; + + /* Skip trailing zeros in the fractional part. */ + for (digit = 0, i = 10; i > 0 && digit == 0; i--) { + digit = nanos % 10; + nanos /= 10; + } + + /* Only format the fraction if it's non-zero. 
*/ + if (i > 0) { + while (i > 0) { + *--t = "0123456789"[digit]; + digit = nanos % 10; + nanos /= 10; + i--; + } + *--t = '.'; + } + t = format_int(t, sec); + + add_pax_attr(as, key, t); +} +#else +static void +add_pax_attr_time(struct archive_string *as, const char *key, + time_t sec, unsigned long nanos) +{ + int digit, i; + char *t; + /* + * Note that each byte contributes fewer than 3 base-10 + * digits, so this will always be big enough. + */ + char tmp[1 + 3*sizeof(sec) + 1 + 3*sizeof(nanos)]; + + tmp[sizeof(tmp) - 1] = 0; + t = tmp + sizeof(tmp) - 1; + + /* Skip trailing zeros in the fractional part. */ + for (digit = 0, i = 10; i > 0 && digit == 0; i--) { + digit = nanos % 10; + nanos /= 10; + } + + /* Only format the fraction if it's non-zero. */ + if (i > 0) { + while (i > 0) { + *--t = "0123456789"[digit]; + digit = nanos % 10; + nanos /= 10; + i--; + } + *--t = '.'; + } + t = format_int(t, sec); + + add_pax_attr(as, key, t); +} +#endif + +#ifndef __minix +static char * +format_int(char *t, int64_t i) +{ + int sign; + + if (i < 0) { + sign = -1; + i = -i; + } else + sign = 1; + + do { + *--t = "0123456789"[i % 10]; + } while (i /= 10); + if (sign < 0) + *--t = '-'; + return (t); +} +#else +static char * +format_int(char *t, int32_t i) +{ + int sign; + + if (i < 0) { + sign = -1; + i = -i; + } else + sign = 1; + + do { + *--t = "0123456789"[i % 10]; + } while (i /= 10); + if (sign < 0) + *--t = '-'; + return (t); +} +#endif + +#ifndef __minix +static void +add_pax_attr_int(struct archive_string *as, const char *key, int64_t value) +{ + char tmp[1 + 3 * sizeof(value)]; + + tmp[sizeof(tmp) - 1] = 0; + add_pax_attr(as, key, format_int(tmp + sizeof(tmp) - 1, value)); +} +#else +static void +add_pax_attr_int(struct archive_string *as, const char *key, int32_t value) +{ + char tmp[1 + 3 * sizeof(value)]; + + tmp[sizeof(tmp) - 1] = 0; + add_pax_attr(as, key, format_int(tmp + sizeof(tmp) - 1, value)); +} +#endif + +static char * +utf8_encode(const wchar_t *wval) 
+{ + int utf8len; + const wchar_t *wp; + unsigned long wc; + char *utf8_value, *p; + + utf8len = 0; + for (wp = wval; *wp != L'\0'; ) { + wc = *wp++; + + if (wc >= 0xd800 && wc <= 0xdbff + && *wp >= 0xdc00 && *wp <= 0xdfff) { + /* This is a surrogate pair. Combine into a + * full Unicode value before encoding into + * UTF-8. */ + wc = (wc - 0xd800) << 10; /* High 10 bits */ + wc += (*wp++ - 0xdc00); /* Low 10 bits */ + wc += 0x10000; /* Skip BMP */ + } + if (wc <= 0x7f) + utf8len++; + else if (wc <= 0x7ff) + utf8len += 2; + else if (wc <= 0xffff) + utf8len += 3; + else if (wc <= 0x1fffff) + utf8len += 4; + else if (wc <= 0x3ffffff) + utf8len += 5; + else if (wc <= 0x7fffffff) + utf8len += 6; + /* Ignore larger values; UTF-8 can't encode them. */ + } + + utf8_value = (char *)malloc(utf8len + 1); + if (utf8_value == NULL) { + __archive_errx(1, "Not enough memory for attributes"); + return (NULL); + } + + for (wp = wval, p = utf8_value; *wp != L'\0'; ) { + wc = *wp++; + if (wc >= 0xd800 && wc <= 0xdbff + && *wp >= 0xdc00 && *wp <= 0xdfff) { + /* Combine surrogate pair. 
*/ + wc = (wc - 0xd800) << 10; + wc += *wp++ - 0xdc00 + 0x10000; + } + if (wc <= 0x7f) { + *p++ = (char)wc; + } else if (wc <= 0x7ff) { + p[0] = 0xc0 | ((wc >> 6) & 0x1f); + p[1] = 0x80 | (wc & 0x3f); + p += 2; + } else if (wc <= 0xffff) { + p[0] = 0xe0 | ((wc >> 12) & 0x0f); + p[1] = 0x80 | ((wc >> 6) & 0x3f); + p[2] = 0x80 | (wc & 0x3f); + p += 3; + } else if (wc <= 0x1fffff) { + p[0] = 0xf0 | ((wc >> 18) & 0x07); + p[1] = 0x80 | ((wc >> 12) & 0x3f); + p[2] = 0x80 | ((wc >> 6) & 0x3f); + p[3] = 0x80 | (wc & 0x3f); + p += 4; + } else if (wc <= 0x3ffffff) { + p[0] = 0xf8 | ((wc >> 24) & 0x03); + p[1] = 0x80 | ((wc >> 18) & 0x3f); + p[2] = 0x80 | ((wc >> 12) & 0x3f); + p[3] = 0x80 | ((wc >> 6) & 0x3f); + p[4] = 0x80 | (wc & 0x3f); + p += 5; + } else if (wc <= 0x7fffffff) { + p[0] = 0xfc | ((wc >> 30) & 0x01); + p[1] = 0x80 | ((wc >> 24) & 0x3f); + p[1] = 0x80 | ((wc >> 18) & 0x3f); + p[2] = 0x80 | ((wc >> 12) & 0x3f); + p[3] = 0x80 | ((wc >> 6) & 0x3f); + p[4] = 0x80 | (wc & 0x3f); + p += 6; + } + /* Ignore larger values; UTF-8 can't encode them. */ + } + *p = '\0'; + + return (utf8_value); +} + +static void +add_pax_attr_w(struct archive_string *as, const char *key, const wchar_t *wval) +{ + char *utf8_value = utf8_encode(wval); + if (utf8_value == NULL) + return; + add_pax_attr(as, key, utf8_value); + free(utf8_value); +} + +/* + * Add a key/value attribute to the pax header. This function handles + * the length field and various other syntactic requirements. + */ +static void +add_pax_attr(struct archive_string *as, const char *key, const char *value) +{ + int digits, i, len, next_ten; + char tmp[1 + 3 * sizeof(int)]; /* < 3 base-10 digits per byte */ + + /*- + * PAX attributes have the following layout: + * <len> <space> <key> <=> <value> <nl> + */ + len = 1 + (int)strlen(key) + 1 + (int)strlen(value) + 1; + + /* + * The <len> field includes the length of the <len> field, so + * computing the correct length is tricky.
I start by + * counting the number of base-10 digits in 'len' and + * computing the next higher power of 10. + */ + next_ten = 1; + digits = 0; + i = len; + while (i > 0) { + i = i / 10; + digits++; + next_ten = next_ten * 10; + } + /* + * For example, if string without the length field is 99 + * chars, then adding the 2 digit length "99" will force the + * total length past 100, requiring an extra digit. The next + * statement adjusts for this effect. + */ + if (len + digits >= next_ten) + digits++; + + /* Now, we have the right length so we can build the line. */ + tmp[sizeof(tmp) - 1] = 0; /* Null-terminate the work area. */ + archive_strcat(as, format_int(tmp + sizeof(tmp) - 1, len + digits)); + archive_strappend_char(as, ' '); + archive_strcat(as, key); + archive_strappend_char(as, '='); + archive_strcat(as, value); + archive_strappend_char(as, '\n'); +} + +static void +archive_write_pax_header_xattrs(struct pax *pax, struct archive_entry *entry) +{ + struct archive_string s; + int i = archive_entry_xattr_reset(entry); + + while (i--) { + const char *name; + const void *value; + char *encoded_value; + char *url_encoded_name = NULL, *encoded_name = NULL; + wchar_t *wcs_name = NULL; + size_t size; + + archive_entry_xattr_next(entry, &name, &value, &size); + /* Name is URL-encoded, then converted to wchar_t, + * then UTF-8 encoded. */ + url_encoded_name = url_encode(name); + if (url_encoded_name != NULL) { + /* Convert narrow-character to wide-character. */ + size_t wcs_length = strlen(url_encoded_name); + wcs_name = (wchar_t *)malloc((wcs_length + 1) * sizeof(wchar_t)); + if (wcs_name == NULL) + __archive_errx(1, "No memory for xattr conversion"); + mbstowcs(wcs_name, url_encoded_name, wcs_length); + wcs_name[wcs_length] = 0; + free(url_encoded_name); /* Done with this. */ + } + if (wcs_name != NULL) { + encoded_name = utf8_encode(wcs_name); + free(wcs_name); /* Done with wchar_t name. 
*/ + } + + encoded_value = base64_encode((const char *)value, size); + + if (encoded_name != NULL && encoded_value != NULL) { + archive_string_init(&s); + archive_strcpy(&s, "LIBARCHIVE.xattr."); + archive_strcat(&s, encoded_name); + add_pax_attr(&(pax->pax_header), s.s, encoded_value); + archive_string_free(&s); + } + free(encoded_name); + free(encoded_value); + } +} + +/* + * TODO: Consider adding 'comment' and 'charset' fields to + * archive_entry so that clients can specify them. Also, consider + * adding generic key/value tags so clients can add arbitrary + * key/value data. + */ +static int +archive_write_pax_header(struct archive_write *a, + struct archive_entry *entry_original) +{ + struct archive_entry *entry_main; + const char *p; + char *t; + const wchar_t *wp; + const char *suffix; + int need_extension, r, ret; + struct pax *pax; + const char *hdrcharset = NULL; + const char *hardlink; + const char *path = NULL, *linkpath = NULL; + const char *uname = NULL, *gname = NULL; + const wchar_t *path_w = NULL, *linkpath_w = NULL; + const wchar_t *uname_w = NULL, *gname_w = NULL; + + char paxbuff[512]; + char ustarbuff[512]; + char ustar_entry_name[256]; + char pax_entry_name[256]; + + ret = ARCHIVE_OK; + need_extension = 0; + pax = (struct pax *)a->format_data; + + hardlink = archive_entry_hardlink(entry_original); + + /* Make sure this is a type of entry that we can handle here */ + if (hardlink == NULL) { + switch (archive_entry_filetype(entry_original)) { + case AE_IFBLK: + case AE_IFCHR: + case AE_IFIFO: + case AE_IFLNK: + case AE_IFREG: + break; + case AE_IFDIR: + /* + * Ensure a trailing '/'. Modify the original + * entry so the client sees the change. 
+ */ + p = archive_entry_pathname(entry_original); + if (p[strlen(p) - 1] != '/') { + t = (char *)malloc(strlen(p) + 2); + if (t == NULL) { + archive_set_error(&a->archive, ENOMEM, + "Can't allocate pax data"); + return(ARCHIVE_FATAL); + } + strcpy(t, p); + strcat(t, "/"); + archive_entry_copy_pathname(entry_original, t); + free(t); + } + break; + case AE_IFSOCK: + archive_set_error(&a->archive, + ARCHIVE_ERRNO_FILE_FORMAT, + "tar format cannot archive socket"); + return (ARCHIVE_WARN); + default: + archive_set_error(&a->archive, + ARCHIVE_ERRNO_FILE_FORMAT, + "tar format cannot archive this (type=0%lo)", + (unsigned long)archive_entry_filetype(entry_original)); + return (ARCHIVE_WARN); + } + } + + /* Copy entry so we can modify it as needed. */ + entry_main = archive_entry_clone(entry_original); + archive_string_empty(&(pax->pax_header)); /* Blank our work area. */ + + /* + * First, check the name fields and see if any of them + * require binary coding. If any of them does, then all of + * them do. 
+ */ + hdrcharset = NULL; + path = archive_entry_pathname(entry_main); + path_w = archive_entry_pathname_w(entry_main); + if (path != NULL && path_w == NULL) { + archive_set_error(&a->archive, ARCHIVE_ERRNO_FILE_FORMAT, + "Can't translate pathname '%s' to UTF-8", path); + ret = ARCHIVE_WARN; + hdrcharset = "BINARY"; + } + uname = archive_entry_uname(entry_main); + uname_w = archive_entry_uname_w(entry_main); + if (uname != NULL && uname_w == NULL) { + archive_set_error(&a->archive, ARCHIVE_ERRNO_FILE_FORMAT, + "Can't translate uname '%s' to UTF-8", uname); + ret = ARCHIVE_WARN; + hdrcharset = "BINARY"; + } + gname = archive_entry_gname(entry_main); + gname_w = archive_entry_gname_w(entry_main); + if (gname != NULL && gname_w == NULL) { + archive_set_error(&a->archive, ARCHIVE_ERRNO_FILE_FORMAT, + "Can't translate gname '%s' to UTF-8", gname); + ret = ARCHIVE_WARN; + hdrcharset = "BINARY"; + } + linkpath = hardlink; + if (linkpath != NULL) { + linkpath_w = archive_entry_hardlink_w(entry_main); + } else { + linkpath = archive_entry_symlink(entry_main); + if (linkpath != NULL) + linkpath_w = archive_entry_symlink_w(entry_main); + } + if (linkpath != NULL && linkpath_w == NULL) { + archive_set_error(&a->archive, ARCHIVE_ERRNO_FILE_FORMAT, + "Can't translate linkpath '%s' to UTF-8", linkpath); + ret = ARCHIVE_WARN; + hdrcharset = "BINARY"; + } + + /* Store the header encoding first, to be nice to readers. */ + if (hdrcharset != NULL) + add_pax_attr(&(pax->pax_header), "hdrcharset", hdrcharset); + + + /* + * If name is too long, or has non-ASCII characters, add + * 'path' to pax extended attrs. (Note that an unconvertible + * name must have non-ASCII characters.) + */ + if (path == NULL) { + /* We don't have a narrow version, so we have to store + * the wide version. */ + add_pax_attr_w(&(pax->pax_header), "path", path_w); + archive_entry_set_pathname(entry_main, "@WidePath"); + need_extension = 1; + } else if (has_non_ASCII(path_w)) { + /* We have non-ASCII characters. 
*/ + if (path_w == NULL || hdrcharset != NULL) { + /* Can't do UTF-8, so store it raw. */ + add_pax_attr(&(pax->pax_header), "path", path); + } else { + /* Store UTF-8 */ + add_pax_attr_w(&(pax->pax_header), + "path", path_w); + } + archive_entry_set_pathname(entry_main, + build_ustar_entry_name(ustar_entry_name, + path, strlen(path), NULL)); + need_extension = 1; + } else { + /* We have an all-ASCII path; we'd like to just store + * it in the ustar header if it will fit. Yes, this + * duplicates some of the logic in + * write_set_format_ustar.c + */ + if (strlen(path) <= 100) { + /* Fits in the old 100-char tar name field. */ + } else { + /* Find largest suffix that will fit. */ + /* Note: strlen() > 100, so strlen() - 100 - 1 >= 0 */ + suffix = strchr(path + strlen(path) - 100 - 1, '/'); + /* Don't attempt an empty prefix. */ + if (suffix == path) + suffix = strchr(suffix + 1, '/'); + /* We can put it in the ustar header if it's + * all ASCII and it's either <= 100 characters + * or can be split at a '/' into a prefix <= + * 155 chars and a suffix <= 100 chars. (Note + * the strchr() above will return NULL exactly + * when the path can't be split.) + */ + if (suffix == NULL /* Suffix > 100 chars. */ + || suffix[1] == '\0' /* empty suffix */ + || suffix - path > 155) /* Prefix > 155 chars */ + { + if (path_w == NULL || hdrcharset != NULL) { + /* Can't do UTF-8, so store it raw. */ + add_pax_attr(&(pax->pax_header), + "path", path); + } else { + /* Store UTF-8 */ + add_pax_attr_w(&(pax->pax_header), + "path", path_w); + } + archive_entry_set_pathname(entry_main, + build_ustar_entry_name(ustar_entry_name, + path, strlen(path), NULL)); + need_extension = 1; + } + } + } + + if (linkpath != NULL) { + /* If link name is too long or has non-ASCII characters, add + * 'linkpath' to pax extended attrs. 
*/ + if (strlen(linkpath) > 100 || linkpath_w == NULL + || linkpath_w == NULL || has_non_ASCII(linkpath_w)) { + if (linkpath_w == NULL || hdrcharset != NULL) + /* If the linkpath is not convertible + * to wide, or we're encoding in + * binary anyway, store it raw. */ + add_pax_attr(&(pax->pax_header), + "linkpath", linkpath); + else + /* If the link is long or has a + * non-ASCII character, store it as a + * pax extended attribute. */ + add_pax_attr_w(&(pax->pax_header), + "linkpath", linkpath_w); + if (strlen(linkpath) > 100) { + if (hardlink != NULL) + archive_entry_set_hardlink(entry_main, + "././@LongHardLink"); + else + archive_entry_set_symlink(entry_main, + "././@LongSymLink"); + } + need_extension = 1; + } + } + +#ifndef __minix + /* If file size is too large, add 'size' to pax extended attrs. */ + if (archive_entry_size(entry_main) >= (((int64_t)1) << 33)) { + add_pax_attr_int(&(pax->pax_header), "size", + archive_entry_size(entry_main)); + need_extension = 1; + } +#else + /* This is not happening on Minix, as the size of anything + * on minix fits in 32 bits + */ +#endif + + /* If numeric GID is too large, add 'gid' to pax extended attrs. */ + if ((unsigned int)archive_entry_gid(entry_main) >= (1 << 18)) { + add_pax_attr_int(&(pax->pax_header), "gid", + archive_entry_gid(entry_main)); + need_extension = 1; + } + + /* If group name is too large or has non-ASCII characters, add + * 'gname' to pax extended attrs. */ + if (gname != NULL) { + if (strlen(gname) > 31 + || gname_w == NULL + || has_non_ASCII(gname_w)) + { + if (gname_w == NULL || hdrcharset != NULL) { + add_pax_attr(&(pax->pax_header), + "gname", gname); + } else { + add_pax_attr_w(&(pax->pax_header), + "gname", gname_w); + } + need_extension = 1; + } + } + + /* If numeric UID is too large, add 'uid' to pax extended attrs. 
*/ + if ((unsigned int)archive_entry_uid(entry_main) >= (1 << 18)) { + add_pax_attr_int(&(pax->pax_header), "uid", + archive_entry_uid(entry_main)); + need_extension = 1; + } + + /* Add 'uname' to pax extended attrs if necessary. */ + if (uname != NULL) { + if (strlen(uname) > 31 + || uname_w == NULL + || has_non_ASCII(uname_w)) + { + if (uname_w == NULL || hdrcharset != NULL) { + add_pax_attr(&(pax->pax_header), + "uname", uname); + } else { + add_pax_attr_w(&(pax->pax_header), + "uname", uname_w); + } + need_extension = 1; + } + } + + /* + * POSIX/SUSv3 doesn't provide a standard key for large device + * numbers. I use the same keys here that Joerg Schilling + * used for 'star.' (Which, somewhat confusingly, are called + * "devXXX" even though they code "rdev" values.) No doubt, + * other implementations use other keys. Note that there's no + * reason we can't write the same information into a number of + * different keys. + * + * Of course, this is only needed for block or char device entries. + */ + if (archive_entry_filetype(entry_main) == AE_IFBLK + || archive_entry_filetype(entry_main) == AE_IFCHR) { + /* + * If rdevmajor is too large, add 'SCHILY.devmajor' to + * extended attributes. + */ + dev_t rdevmajor, rdevminor; + rdevmajor = archive_entry_rdevmajor(entry_main); + rdevminor = archive_entry_rdevminor(entry_main); + if (rdevmajor >= (1 << 18)) { + add_pax_attr_int(&(pax->pax_header), "SCHILY.devmajor", + rdevmajor); + /* + * Non-strict formatting below means we don't + * have to truncate here. Not truncating improves + * the chance that some more modern tar archivers + * (such as GNU tar 1.13) can restore the full + * value even if they don't understand the pax + * extended attributes. See my rant below about + * file size fields for additional details. + */ + /* archive_entry_set_rdevmajor(entry_main, + rdevmajor & ((1 << 18) - 1)); */ + need_extension = 1; + } + + /* + * If devminor is too large, add 'SCHILY.devminor' to + * extended attributes. 
+ */ + if (rdevminor >= (1 << 18)) { + add_pax_attr_int(&(pax->pax_header), "SCHILY.devminor", + rdevminor); + /* Truncation is not necessary here, either. */ + /* archive_entry_set_rdevminor(entry_main, + rdevminor & ((1 << 18) - 1)); */ + need_extension = 1; + } + } + + /* + * Technically, the mtime field in the ustar header can + * support 33 bits, but many platforms use signed 32-bit time + * values. The cutoff of 0x7fffffff here is a compromise. + * Yes, this check is duplicated just below; this helps to + * avoid writing an mtime attribute just to handle a + * high-resolution timestamp in "restricted pax" mode. + */ + if (!need_extension && + ((archive_entry_mtime(entry_main) < 0) + || (archive_entry_mtime(entry_main) >= 0x7fffffff))) + need_extension = 1; + + /* I use a star-compatible file flag attribute. */ + p = archive_entry_fflags_text(entry_main); + if (!need_extension && p != NULL && *p != '\0') + need_extension = 1; + + /* If there are non-trivial ACL entries, we need an extension. */ + if (!need_extension && archive_entry_acl_count(entry_original, + ARCHIVE_ENTRY_ACL_TYPE_ACCESS) > 0) + need_extension = 1; + + /* If there are non-trivial ACL entries, we need an extension. */ + if (!need_extension && archive_entry_acl_count(entry_original, + ARCHIVE_ENTRY_ACL_TYPE_DEFAULT) > 0) + need_extension = 1; + + /* If there are extended attributes, we need an extension */ + if (!need_extension && archive_entry_xattr_count(entry_original) > 0) + need_extension = 1; + + /* + * The following items are handled differently in "pax + * restricted" format. In particular, in "pax restricted" + * format they won't be added unless need_extension is + * already set (we're already generating an extended header, so + * may as well include these). 
+ */ + if (a->archive.archive_format != ARCHIVE_FORMAT_TAR_PAX_RESTRICTED || + need_extension) { + + if (archive_entry_mtime(entry_main) < 0 || + archive_entry_mtime(entry_main) >= 0x7fffffff || + archive_entry_mtime_nsec(entry_main) != 0) + add_pax_attr_time(&(pax->pax_header), "mtime", + archive_entry_mtime(entry_main), + archive_entry_mtime_nsec(entry_main)); + + if (archive_entry_ctime(entry_main) != 0 || + archive_entry_ctime_nsec(entry_main) != 0) + add_pax_attr_time(&(pax->pax_header), "ctime", + archive_entry_ctime(entry_main), + archive_entry_ctime_nsec(entry_main)); + + if (archive_entry_atime(entry_main) != 0 || + archive_entry_atime_nsec(entry_main) != 0) + add_pax_attr_time(&(pax->pax_header), "atime", + archive_entry_atime(entry_main), + archive_entry_atime_nsec(entry_main)); + + /* Store birth/creationtime only if it's earlier than mtime */ + if (archive_entry_birthtime_is_set(entry_main) && + archive_entry_birthtime(entry_main) + < archive_entry_mtime(entry_main)) + add_pax_attr_time(&(pax->pax_header), + "LIBARCHIVE.creationtime", + archive_entry_birthtime(entry_main), + archive_entry_birthtime_nsec(entry_main)); + + /* I use a star-compatible file flag attribute. */ + p = archive_entry_fflags_text(entry_main); + if (p != NULL && *p != '\0') + add_pax_attr(&(pax->pax_header), "SCHILY.fflags", p); + + /* I use star-compatible ACL attributes. */ + wp = archive_entry_acl_text_w(entry_original, + ARCHIVE_ENTRY_ACL_TYPE_ACCESS | + ARCHIVE_ENTRY_ACL_STYLE_EXTRA_ID); + if (wp != NULL && *wp != L'\0') + add_pax_attr_w(&(pax->pax_header), + "SCHILY.acl.access", wp); + wp = archive_entry_acl_text_w(entry_original, + ARCHIVE_ENTRY_ACL_TYPE_DEFAULT | + ARCHIVE_ENTRY_ACL_STYLE_EXTRA_ID); + if (wp != NULL && *wp != L'\0') + add_pax_attr_w(&(pax->pax_header), + "SCHILY.acl.default", wp); + + /* Include star-compatible metadata info. */ + /* Note: "SCHILY.dev{major,minor}" are NOT the + * major/minor portions of "SCHILY.dev". 
*/ + add_pax_attr_int(&(pax->pax_header), "SCHILY.dev", + archive_entry_dev(entry_main)); +#ifndef __minix + add_pax_attr_int(&(pax->pax_header), "SCHILY.ino", + archive_entry_ino64(entry_main)); +#else + add_pax_attr_int(&(pax->pax_header), "SCHILY.ino", + archive_entry_ino(entry_main)); +#endif + add_pax_attr_int(&(pax->pax_header), "SCHILY.nlink", + archive_entry_nlink(entry_main)); + + /* Store extended attributes */ + archive_write_pax_header_xattrs(pax, entry_original); + } + + /* Only regular files have data. */ + if (archive_entry_filetype(entry_main) != AE_IFREG) + archive_entry_set_size(entry_main, 0); + + /* + * Pax-restricted does not store data for hardlinks, in order + * to improve compatibility with ustar. + */ + if (a->archive.archive_format != ARCHIVE_FORMAT_TAR_PAX_INTERCHANGE && + hardlink != NULL) + archive_entry_set_size(entry_main, 0); + + /* + * XXX Full pax interchange format does permit a hardlink + * entry to have data associated with it. I'm not supporting + * that here because the client expects me to tell them whether + * or not this format expects data for hardlinks. If I + * don't check here, then every pax archive will end up with + * duplicated data for hardlinks. Someday, there may be + * need to select this behavior, in which case the following + * will need to be revisited. XXX + */ + if (hardlink != NULL) + archive_entry_set_size(entry_main, 0); + + /* Format 'ustar' header for main entry. + * + * The trouble with file size: If the reader can't understand + * the file size, they may not be able to locate the next + * entry and the rest of the archive is toast. Pax-compliant + * readers are supposed to ignore the file size in the main + * header, so the question becomes how to maximize portability + * for readers that don't support pax attribute extensions. 
+ * For maximum compatibility, I permit numeric extensions in + * the main header so that the file size stored will always be + * correct, even if it's in a format that only some + * implementations understand. The technique used here is: + * + * a) If possible, follow the standard exactly. This handles + * files up to 8 gigabytes minus 1. + * + * b) If that fails, try octal but omit the field terminator. + * That handles files up to 64 gigabytes minus 1. + * + * c) Otherwise, use base-256 extensions. That handles files + * up to 2^63 in this implementation, with the potential to + * go up to 2^94. That should hold us for a while. ;-) + * + * The non-strict formatter uses similar logic for other + * numeric fields, though they're less critical. + */ + __archive_write_format_header_ustar(a, ustarbuff, entry_main, -1, 0); + + /* If we built any extended attributes, write that entry first. */ + if (archive_strlen(&(pax->pax_header)) > 0) { + struct archive_entry *pax_attr_entry; + time_t s; + uid_t uid; + gid_t gid; + mode_t mode; + + pax_attr_entry = archive_entry_new(); + p = archive_entry_pathname(entry_main); + archive_entry_set_pathname(pax_attr_entry, + build_pax_attribute_name(pax_entry_name, p)); + archive_entry_set_size(pax_attr_entry, + archive_strlen(&(pax->pax_header))); + /* Copy uid/gid (but clip to ustar limits). 
*/ + uid = archive_entry_uid(entry_main); +#ifndef __minix /* Does not happen on minix as sizeof(uid_t) == 2 */ + if ((unsigned int)uid >= 1 << 18) + uid = (uid_t)(1 << 18) - 1; +#endif + archive_entry_set_uid(pax_attr_entry, uid); + gid = archive_entry_gid(entry_main); +#ifndef __minix /* Does not happen on minix as sizeof(gid_t) == 1 */ + if ((unsigned int)gid >= 1 << 18) + gid = (gid_t)(1 << 18) - 1; +#endif + archive_entry_set_gid(pax_attr_entry, gid); + /* Copy mode over (but not setuid/setgid bits) */ + mode = archive_entry_mode(entry_main); +#ifdef S_ISUID + mode &= ~S_ISUID; +#endif +#ifdef S_ISGID + mode &= ~S_ISGID; +#endif +#ifdef S_ISVTX + mode &= ~S_ISVTX; +#endif + archive_entry_set_mode(pax_attr_entry, mode); + + /* Copy uname/gname. */ + archive_entry_set_uname(pax_attr_entry, + archive_entry_uname(entry_main)); + archive_entry_set_gname(pax_attr_entry, + archive_entry_gname(entry_main)); + + /* Copy mtime, but clip to ustar limits. */ + s = archive_entry_mtime(entry_main); + if (s < 0) { s = 0; } + if (s >= 0x7fffffff) { s = 0x7fffffff; } + archive_entry_set_mtime(pax_attr_entry, s, 0); + + /* Standard ustar doesn't support atime. */ + archive_entry_set_atime(pax_attr_entry, 0, 0); + + /* Standard ustar doesn't support ctime. */ + archive_entry_set_ctime(pax_attr_entry, 0, 0); + + r = __archive_write_format_header_ustar(a, paxbuff, + pax_attr_entry, 'x', 1); + + archive_entry_free(pax_attr_entry); + + /* Note that the 'x' header shouldn't ever fail to format */ + if (r != 0) { + const char *msg = "archive_write_pax_header: " + "'x' header failed?! 
This can't happen.\n"; + size_t u = write(2, msg, strlen(msg)); + (void)u; /* UNUSED */ + exit(1); + } + r = (a->compressor.write)(a, paxbuff, 512); + if (r != ARCHIVE_OK) { + pax->entry_bytes_remaining = 0; + pax->entry_padding = 0; + return (ARCHIVE_FATAL); + } + + pax->entry_bytes_remaining = archive_strlen(&(pax->pax_header)); +#ifndef __minix + pax->entry_padding = 0x1ff & (-(int64_t)pax->entry_bytes_remaining); +#else + pax->entry_padding = 0x1ff & (-(int32_t)pax->entry_bytes_remaining); +#endif + r = (a->compressor.write)(a, pax->pax_header.s, + archive_strlen(&(pax->pax_header))); + if (r != ARCHIVE_OK) { + /* If a write fails, we're pretty much toast. */ + return (ARCHIVE_FATAL); + } + /* Pad out the end of the entry. */ + r = write_nulls(a, pax->entry_padding); + if (r != ARCHIVE_OK) { + /* If a write fails, we're pretty much toast. */ + return (ARCHIVE_FATAL); + } + pax->entry_bytes_remaining = pax->entry_padding = 0; + } + + /* Write the header for main entry. */ + r = (a->compressor.write)(a, ustarbuff, 512); + if (r != ARCHIVE_OK) + return (r); + + /* + * Inform the client of the on-disk size we're using, so + * they can avoid unnecessarily writing a body for something + * that we're just going to ignore. + */ + archive_entry_set_size(entry_original, archive_entry_size(entry_main)); + pax->entry_bytes_remaining = archive_entry_size(entry_main); +#ifndef __minix + pax->entry_padding = 0x1ff & (-(int64_t)pax->entry_bytes_remaining); +#else + pax->entry_padding = 0x1ff & (-(int32_t)pax->entry_bytes_remaining); +#endif + archive_entry_free(entry_main); + + return (ret); +} + +/* + * We need a valid name for the regular 'ustar' entry. This routine + * tries to hack something more-or-less reasonable. + * + * The approach here tries to preserve leading dir names. We do so by + * working with four sections: + * 1) "prefix" directory names, + * 2) "suffix" directory names, + * 3) inserted dir name (optional), + * 4) filename. 
+ * + * These sections must satisfy the following requirements: + * * Parts 1 & 2 together form an initial portion of the dir name. + * * Part 3 is specified by the caller. (It should not contain a leading + * or trailing '/'.) + * * Part 4 forms an initial portion of the base filename. + * * The filename must be <= 99 chars to fit the ustar 'name' field. + * * Parts 2, 3, 4 together must be <= 99 chars to fit the ustar 'name' fld. + * * Part 1 must be <= 155 chars to fit the ustar 'prefix' field. + * * If the original name ends in a '/', the new name must also end in a '/' + * * Trailing '/.' sequences may be stripped. + * + * Note: Recall that the ustar format does not store the '/' separating + * parts 1 & 2, but does store the '/' separating parts 2 & 3. + */ +static char * +build_ustar_entry_name(char *dest, const char *src, size_t src_length, + const char *insert) +{ + const char *prefix, *prefix_end; + const char *suffix, *suffix_end; + const char *filename, *filename_end; + char *p; + int need_slash = 0; /* Was there a trailing slash? */ + size_t suffix_length = 99; + size_t insert_length; + + /* Length of additional dir element to be added. */ + if (insert == NULL) + insert_length = 0; + else + /* +2 here allows for '/' before and after the insert. */ + insert_length = strlen(insert) + 2; + + /* Step 0: Quick bailout in a common case. */ + if (src_length < 100 && insert == NULL) { + strncpy(dest, src, src_length); + dest[src_length] = '\0'; + return (dest); + } + + /* Step 1: Locate filename and enforce the length restriction. */ + filename_end = src + src_length; + /* Remove trailing '/' chars and '/.' pairs. */ + for (;;) { + if (filename_end > src && filename_end[-1] == '/') { + filename_end --; + need_slash = 1; /* Remember to restore trailing '/'. */ + continue; + } + if (filename_end > src + 1 && filename_end[-1] == '.' + && filename_end[-2] == '/') { + filename_end -= 2; + need_slash = 1; /* "foo/." 
will become "foo/" */ + continue; + } + break; + } + if (need_slash) + suffix_length--; + /* Find start of filename. */ + filename = filename_end - 1; + while ((filename > src) && (*filename != '/')) + filename --; + if ((*filename == '/') && (filename < filename_end - 1)) + filename ++; + /* Adjust filename_end so that filename + insert fits in 99 chars. */ + suffix_length -= insert_length; + if (filename_end > filename + suffix_length) + filename_end = filename + suffix_length; + /* Calculate max size for "suffix" section (#2 above). */ + suffix_length -= filename_end - filename; + + /* Step 2: Locate the "prefix" section of the dirname, including + * trailing '/'. */ + prefix = src; + prefix_end = prefix + 155; + if (prefix_end > filename) + prefix_end = filename; + while (prefix_end > prefix && *prefix_end != '/') + prefix_end--; + if ((prefix_end < filename) && (*prefix_end == '/')) + prefix_end++; + + /* Step 3: Locate the "suffix" section of the dirname, + * including trailing '/'. */ + suffix = prefix_end; + suffix_end = suffix + suffix_length; /* Enforce limit. */ + if (suffix_end > filename) + suffix_end = filename; + if (suffix_end < suffix) + suffix_end = suffix; + while (suffix_end > suffix && *suffix_end != '/') + suffix_end--; + if ((suffix_end < filename) && (*suffix_end == '/')) + suffix_end++; + + /* Step 4: Build the new name. */ + /* The OpenBSD strlcpy function is safer, but less portable. */ + /* Rather than maintain two versions, just use the strncpy version.
*/ + p = dest; + if (prefix_end > prefix) { + strncpy(p, prefix, prefix_end - prefix); + p += prefix_end - prefix; + } + if (suffix_end > suffix) { + strncpy(p, suffix, suffix_end - suffix); + p += suffix_end - suffix; + } + if (insert != NULL) { + /* Note: assume insert does not have leading or trailing '/' */ + strcpy(p, insert); + p += strlen(insert); + *p++ = '/'; + } + strncpy(p, filename, filename_end - filename); + p += filename_end - filename; + if (need_slash) + *p++ = '/'; + *p = '\0'; + + return (dest); +} + +/* + * The ustar header for the pax extended attributes must have a + * reasonable name: SUSv3 requires 'dirname'/PaxHeader.'pid'/'filename' + * where 'pid' is the PID of the archiving process. Unfortunately, + * that makes testing a pain since the output varies for each run, + * so I'm sticking with the simpler 'dirname'/PaxHeader/'filename' + * for now. (Someday, I'll make this settable. Then I can use the + * SUS recommendation as default and test harnesses can override it + * to get predictable results.) + * + * Joerg Schilling has argued that this is unnecessary because, in + * practice, if the pax extended attributes get extracted as regular + * files, no one is going to bother reading those attributes to + * manually restore them. Based on this, 'star' uses + * /tmp/PaxHeader/'basename' as the ustar header name. This is a + * tempting argument, in part because it's simpler than the SUSv3 + * recommendation, but I'm not entirely convinced. I'm also + * uncomfortable with the fact that "/tmp" is a Unix-ism. + * + * The following routine leverages build_ustar_entry_name() above and + * so is simpler than you might think. It just needs to provide the + * additional path element and handle a few pathological cases. + */ +static char * +build_pax_attribute_name(char *dest, const char *src) +{ + char buff[64]; + const char *p; + + /* Handle the null filename case.
*/ + if (src == NULL || *src == '\0') { + strcpy(dest, "PaxHeader/blank"); + return (dest); + } + + /* Prune final '/' and other unwanted final elements. */ + p = src + strlen(src); + for (;;) { + /* Ends in "/", remove the '/' */ + if (p > src && p[-1] == '/') { + --p; + continue; + } + /* Ends in "/.", remove the '.' */ + if (p > src + 1 && p[-1] == '.' + && p[-2] == '/') { + --p; + continue; + } + break; + } + + /* Pathological case: After above, there was nothing left. + * This includes "/." "/./." "/.//./." etc. */ + if (p == src) { + strcpy(dest, "/PaxHeader/rootdir"); + return (dest); + } + + /* Convert unadorned "." into a suitable filename. */ + if (*src == '.' && p == src + 1) { + strcpy(dest, "PaxHeader/currentdir"); + return (dest); + } + + /* + * TODO: Push this string into the 'pax' structure to avoid + * recomputing it every time. That will also open the door + * to having clients override it. + */ +#if HAVE_GETPID && 0 /* Disable this for now; see above comment. */ + sprintf(buff, "PaxHeader.%d", getpid()); +#else + /* If the platform can't fetch the pid, don't include it. */ + strcpy(buff, "PaxHeader"); +#endif + /* General case: build a ustar-compatible name adding "/PaxHeader/". 
*/ + build_ustar_entry_name(dest, src, p - src, buff); + + return (dest); +} + +/* Write two null blocks for the end of archive */ +static int +archive_write_pax_finish(struct archive_write *a) +{ + int r; + + if (a->compressor.write == NULL) + return (ARCHIVE_OK); + + r = write_nulls(a, 512 * 2); + return (r); +} + +static int +archive_write_pax_destroy(struct archive_write *a) +{ + struct pax *pax; + + pax = (struct pax *)a->format_data; + if (pax == NULL) + return (ARCHIVE_OK); + + archive_string_free(&pax->pax_header); + free(pax); + a->format_data = NULL; + return (ARCHIVE_OK); +} + +static int +archive_write_pax_finish_entry(struct archive_write *a) +{ + struct pax *pax; + int ret; + + pax = (struct pax *)a->format_data; + ret = write_nulls(a, pax->entry_bytes_remaining + pax->entry_padding); + pax->entry_bytes_remaining = pax->entry_padding = 0; + return (ret); +} + +static int +write_nulls(struct archive_write *a, size_t padding) +{ + int ret; + size_t to_write; + + while (padding > 0) { + to_write = padding < a->null_length ? padding : a->null_length; + ret = (a->compressor.write)(a, a->nulls, to_write); + if (ret != ARCHIVE_OK) + return (ret); + padding -= to_write; + } + return (ARCHIVE_OK); +} + +static ssize_t +archive_write_pax_data(struct archive_write *a, const void *buff, size_t s) +{ + struct pax *pax; + int ret; + + pax = (struct pax *)a->format_data; + if (s > pax->entry_bytes_remaining) + s = pax->entry_bytes_remaining; + + ret = (a->compressor.write)(a, buff, s); + pax->entry_bytes_remaining -= s; + if (ret == ARCHIVE_OK) + return (s); + else + return (ret); +} + +static int +has_non_ASCII(const wchar_t *wp) +{ + if (wp == NULL) + return (1); + while (*wp != L'\0' && *wp < 128) + wp++; + return (*wp != L'\0'); +} + +/* + * Used by extended attribute support; encodes the name + * so that there will be no '=' characters in the result. 
+ */ +static char * +url_encode(const char *in) +{ + const char *s; + char *d; + int out_len = 0; + char *out; + + for (s = in; *s != '\0'; s++) { + if (*s < 33 || *s > 126 || *s == '%' || *s == '=') + out_len += 3; + else + out_len++; + } + + out = (char *)malloc(out_len + 1); + if (out == NULL) + return (NULL); + + for (s = in, d = out; *s != '\0'; s++) { + /* encode any non-printable ASCII character or '%' or '=' */ + if (*s < 33 || *s > 126 || *s == '%' || *s == '=') { + /* URL encoding is '%' followed by two hex digits */ + *d++ = '%'; + *d++ = "0123456789ABCDEF"[0x0f & (*s >> 4)]; + *d++ = "0123456789ABCDEF"[0x0f & *s]; + } else { + *d++ = *s; + } + } + *d = '\0'; + return (out); +} + +/* + * Encode a sequence of bytes into a C string using base-64 encoding. + * + * Returns a null-terminated C string allocated with malloc(); caller + * is responsible for freeing the result. + */ +static char * +base64_encode(const char *s, size_t len) +{ + static const char digits[64] = + { 'A','B','C','D','E','F','G','H','I','J','K','L','M','N','O', + 'P','Q','R','S','T','U','V','W','X','Y','Z','a','b','c','d', + 'e','f','g','h','i','j','k','l','m','n','o','p','q','r','s', + 't','u','v','w','x','y','z','0','1','2','3','4','5','6','7', + '8','9','+','/' }; + int v; + char *d, *out; + + /* 3 bytes becomes 4 chars, but round up and allow for trailing NUL */ + out = (char *)malloc((len * 4 + 2) / 3 + 1); + if (out == NULL) + return (NULL); + d = out; + + /* Convert each group of 3 bytes into 4 characters. */ + while (len >= 3) { + v = (((int)s[0] << 16) & 0xff0000) + | (((int)s[1] << 8) & 0xff00) + | (((int)s[2]) & 0x00ff); + s += 3; + len -= 3; + *d++ = digits[(v >> 18) & 0x3f]; + *d++ = digits[(v >> 12) & 0x3f]; + *d++ = digits[(v >> 6) & 0x3f]; + *d++ = digits[(v) & 0x3f]; + } + /* Handle final group of 1 byte (2 chars) or 2 bytes (3 chars). 
*/ + switch (len) { + case 0: break; + case 1: + v = (((int)s[0] << 16) & 0xff0000); + *d++ = digits[(v >> 18) & 0x3f]; + *d++ = digits[(v >> 12) & 0x3f]; + break; + case 2: + v = (((int)s[0] << 16) & 0xff0000) + | (((int)s[1] << 8) & 0xff00); + *d++ = digits[(v >> 18) & 0x3f]; + *d++ = digits[(v >> 12) & 0x3f]; + *d++ = digits[(v >> 6) & 0x3f]; + break; + } + /* Add trailing NUL character so output is a valid C string. */ + *d = '\0'; + return (out); +} diff --git a/lib/libarchive/archive_write_set_format_shar.c b/lib/libarchive/archive_write_set_format_shar.c new file mode 100644 index 000000000..62a875b98 --- /dev/null +++ b/lib/libarchive/archive_write_set_format_shar.c @@ -0,0 +1,626 @@ +/*- + * Copyright (c) 2003-2007 Tim Kientzle + * Copyright (c) 2008 Joerg Sonnenberger + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR(S) ``AS IS'' AND ANY EXPRESS OR + * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES + * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. 
+ * IN NO EVENT SHALL THE AUTHOR(S) BE LIABLE FOR ANY DIRECT, INDIRECT, + * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT + * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF + * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + */ + +#include "archive_platform.h" +__FBSDID("$FreeBSD: head/lib/libarchive/archive_write_set_format_shar.c 189438 2009-03-06 05:58:56Z kientzle $"); + +#ifdef HAVE_ERRNO_H +#include <errno.h> +#endif +#include <stdio.h> +#ifdef HAVE_STDLIB_H +#include <stdlib.h> +#endif +#ifdef HAVE_STRING_H +#include <string.h> +#endif + +#include "archive.h" +#include "archive_entry.h" +#include "archive_private.h" +#include "archive_write_private.h" + +struct shar { + int dump; + int end_of_line; + struct archive_entry *entry; + int has_data; + char *last_dir; + + /* Line buffer for uuencoded dump format */ + char outbuff[45]; + size_t outpos; + + int wrote_header; + struct archive_string work; + struct archive_string quoted_name; +}; + +static int archive_write_shar_finish(struct archive_write *); +static int archive_write_shar_destroy(struct archive_write *); +static int archive_write_shar_header(struct archive_write *, + struct archive_entry *); +static ssize_t archive_write_shar_data_sed(struct archive_write *, + const void * buff, size_t); +static ssize_t archive_write_shar_data_uuencode(struct archive_write *, + const void * buff, size_t); +static int archive_write_shar_finish_entry(struct archive_write *); + +/* + * Copy the given string to the buffer, quoting all shell meta characters + * found.
+ */ +static void +shar_quote(struct archive_string *buf, const char *str, int in_shell) +{ + static const char meta[] = "\n \t'`\";&<>()|*?{}[]\\$!#^~"; + size_t len; + + while (*str != '\0') { + if ((len = strcspn(str, meta)) != 0) { + archive_strncat(buf, str, len); + str += len; + } else if (*str == '\n') { + if (in_shell) + archive_strcat(buf, "\"\n\""); + else + archive_strcat(buf, "\\n"); + ++str; + } else { + archive_strappend_char(buf, '\\'); + archive_strappend_char(buf, *str); + ++str; + } + } +} + +/* + * Set output format to 'shar' format. + */ +int +archive_write_set_format_shar(struct archive *_a) +{ + struct archive_write *a = (struct archive_write *)_a; + struct shar *shar; + + /* If someone else was already registered, unregister them. */ + if (a->format_destroy != NULL) + (a->format_destroy)(a); + + shar = (struct shar *)malloc(sizeof(*shar)); + if (shar == NULL) { + archive_set_error(&a->archive, ENOMEM, "Can't allocate shar data"); + return (ARCHIVE_FATAL); + } + memset(shar, 0, sizeof(*shar)); + archive_string_init(&shar->work); + archive_string_init(&shar->quoted_name); + a->format_data = shar; + + a->pad_uncompressed = 0; + a->format_name = "shar"; + a->format_write_header = archive_write_shar_header; + a->format_finish = archive_write_shar_finish; + a->format_destroy = archive_write_shar_destroy; + a->format_write_data = archive_write_shar_data_sed; + a->format_finish_entry = archive_write_shar_finish_entry; + a->archive.archive_format = ARCHIVE_FORMAT_SHAR_BASE; + a->archive.archive_format_name = "shar"; + return (ARCHIVE_OK); +} + +/* + * An alternate 'shar' that uuencodes file contents (unpacked with + * uudecode instead of 'sed') and can therefore archive binary files. + * In addition, this variant also attempts to restore ownership, file modes, + * and other extended file information.
+ */ +int +archive_write_set_format_shar_dump(struct archive *_a) +{ + struct archive_write *a = (struct archive_write *)_a; + struct shar *shar; + + archive_write_set_format_shar(&a->archive); + shar = (struct shar *)a->format_data; + shar->dump = 1; + a->format_write_data = archive_write_shar_data_uuencode; + a->archive.archive_format = ARCHIVE_FORMAT_SHAR_DUMP; + a->archive.archive_format_name = "shar dump"; + return (ARCHIVE_OK); +} + +static int +archive_write_shar_header(struct archive_write *a, struct archive_entry *entry) +{ + const char *linkname; + const char *name; + char *p, *pp; + struct shar *shar; + + shar = (struct shar *)a->format_data; + if (!shar->wrote_header) { + archive_strcat(&shar->work, "#!/bin/sh\n"); + archive_strcat(&shar->work, "# This is a shell archive\n"); + shar->wrote_header = 1; + } + + /* Save the entry for the closing. */ + if (shar->entry) + archive_entry_free(shar->entry); + shar->entry = archive_entry_clone(entry); + name = archive_entry_pathname(entry); + + /* Handle some preparatory issues. */ + switch(archive_entry_filetype(entry)) { + case AE_IFREG: + /* Only regular files have non-zero size. */ + break; + case AE_IFDIR: + archive_entry_set_size(entry, 0); + /* Don't bother trying to recreate '.' */ + if (strcmp(name, ".") == 0 || strcmp(name, "./") == 0) + return (ARCHIVE_OK); + break; + case AE_IFIFO: + case AE_IFCHR: + case AE_IFBLK: + /* All other file types have zero size in the archive. */ + archive_entry_set_size(entry, 0); + break; + default: + archive_entry_set_size(entry, 0); + if (archive_entry_hardlink(entry) == NULL && + archive_entry_symlink(entry) == NULL) { + archive_set_error(&a->archive, ARCHIVE_ERRNO_MISC, + "shar format cannot archive this"); + return (ARCHIVE_WARN); + } + } + + archive_string_empty(&shar->quoted_name); + shar_quote(&shar->quoted_name, name, 1); + + /* Stock preparation for all file types. 
*/ + archive_string_sprintf(&shar->work, "echo x %s\n", shar->quoted_name.s); + + if (archive_entry_filetype(entry) != AE_IFDIR) { + /* Try to create the dir. */ + p = strdup(name); + pp = strrchr(p, '/'); + /* If there is a / character, try to create the dir. */ + if (pp != NULL) { + *pp = '\0'; + + /* Try to avoid a lot of redundant mkdir commands. */ + if (strcmp(p, ".") == 0) { + /* Don't try to "mkdir ." */ + free(p); + } else if (shar->last_dir == NULL) { + archive_strcat(&shar->work, "mkdir -p "); + shar_quote(&shar->work, p, 1); + archive_strcat(&shar->work, + " > /dev/null 2>&1\n"); + shar->last_dir = p; + } else if (strcmp(p, shar->last_dir) == 0) { + /* We've already created this exact dir. */ + free(p); + } else if (strlen(p) < strlen(shar->last_dir) && + strncmp(p, shar->last_dir, strlen(p)) == 0) { + /* We've already created a subdir. */ + free(p); + } else { + archive_strcat(&shar->work, "mkdir -p "); + shar_quote(&shar->work, p, 1); + archive_strcat(&shar->work, + " > /dev/null 2>&1\n"); + shar->last_dir = p; + } + } else { + free(p); + } + } + + /* Handle file-type specific issues. */ + shar->has_data = 0; + if ((linkname = archive_entry_hardlink(entry)) != NULL) { + archive_strcat(&shar->work, "ln -f "); + shar_quote(&shar->work, linkname, 1); + archive_string_sprintf(&shar->work, " %s\n", + shar->quoted_name.s); + } else if ((linkname = archive_entry_symlink(entry)) != NULL) { + archive_strcat(&shar->work, "ln -fs "); + shar_quote(&shar->work, linkname, 1); + archive_string_sprintf(&shar->work, " %s\n", + shar->quoted_name.s); + } else { + switch(archive_entry_filetype(entry)) { + case AE_IFREG: + if (archive_entry_size(entry) == 0) { + /* More portable than "touch." 
*/ + archive_string_sprintf(&shar->work, + "test -e \"%s\" || :> \"%s\"\n", + shar->quoted_name.s, shar->quoted_name.s); + } else { + if (shar->dump) { + archive_string_sprintf(&shar->work, + "uudecode -p > %s << 'SHAR_END'\n", + shar->quoted_name.s); + archive_string_sprintf(&shar->work, + "begin %o ", + archive_entry_mode(entry) & 0777); + shar_quote(&shar->work, name, 0); + archive_strcat(&shar->work, "\n"); + } else { + archive_string_sprintf(&shar->work, + "sed 's/^X//' > %s << 'SHAR_END'\n", + shar->quoted_name.s); + } + shar->has_data = 1; + shar->end_of_line = 1; + shar->outpos = 0; + } + break; + case AE_IFDIR: + archive_string_sprintf(&shar->work, + "mkdir -p %s > /dev/null 2>&1\n", + shar->quoted_name.s); + /* Record that we just created this directory. */ + if (shar->last_dir != NULL) + free(shar->last_dir); + + shar->last_dir = strdup(name); + /* Trim a trailing '/'. */ + pp = strrchr(shar->last_dir, '/'); + if (pp != NULL && pp[1] == '\0') + *pp = '\0'; + /* + * TODO: Put dir name/mode on a list to be fixed + * up at end of archive. 
+ */ + break; + case AE_IFIFO: + archive_string_sprintf(&shar->work, + "mkfifo %s\n", shar->quoted_name.s); + break; + case AE_IFCHR: + archive_string_sprintf(&shar->work, + "mknod %s c %d %d\n", shar->quoted_name.s, + archive_entry_rdevmajor(entry), + archive_entry_rdevminor(entry)); + break; + case AE_IFBLK: + archive_string_sprintf(&shar->work, + "mknod %s b %d %d\n", shar->quoted_name.s, + archive_entry_rdevmajor(entry), + archive_entry_rdevminor(entry)); + break; + default: + return (ARCHIVE_WARN); + } + } + + return (ARCHIVE_OK); +} + +static ssize_t +archive_write_shar_data_sed(struct archive_write *a, const void *buff, size_t n) +{ + static const size_t ensured = 65533; + struct shar *shar; + const char *src; + char *buf, *buf_end; + int ret; + size_t written = n; + + shar = (struct shar *)a->format_data; + if (!shar->has_data || n == 0) + return (0); + + src = (const char *)buff; + + /* + * 'ensured' is the number of bytes in buffer before expanding the + * current character. Each operation writes the current character + * and optionally the start-of-new-line marker. This can happen + * twice before entering the loop, so make sure three additional + * bytes can be written.
+ */ + if (archive_string_ensure(&shar->work, ensured + 3) == NULL) + __archive_errx(1, "Out of memory"); + + if (shar->work.length > ensured) { + ret = (*a->compressor.write)(a, shar->work.s, + shar->work.length); + if (ret != ARCHIVE_OK) + return (ARCHIVE_FATAL); + archive_string_empty(&shar->work); + } + buf = shar->work.s + shar->work.length; + buf_end = shar->work.s + ensured; + + if (shar->end_of_line) { + *buf++ = 'X'; + shar->end_of_line = 0; + } + + while (n-- != 0) { + if ((*buf++ = *src++) == '\n') { + if (n == 0) + shar->end_of_line = 1; + else + *buf++ = 'X'; + } + + if (buf >= buf_end) { + shar->work.length = buf - shar->work.s; + ret = (*a->compressor.write)(a, shar->work.s, + shar->work.length); + if (ret != ARCHIVE_OK) + return (ARCHIVE_FATAL); + archive_string_empty(&shar->work); + buf = shar->work.s; + } + } + + shar->work.length = buf - shar->work.s; + + return (written); +} + +#define UUENC(c) (((c)!=0) ? ((c) & 077) + ' ': '`') + +static void +uuencode_group(const char _in[3], char out[4]) +{ + const unsigned char *in = (const unsigned char *)_in; + int t; + + t = (in[0] << 16) | (in[1] << 8) | in[2]; + out[0] = UUENC( 0x3f & (t >> 18) ); + out[1] = UUENC( 0x3f & (t >> 12) ); + out[2] = UUENC( 0x3f & (t >> 6) ); + out[3] = UUENC( 0x3f & t ); +} + +static void +uuencode_line(struct shar *shar, const char *inbuf, size_t len) +{ + char tmp_buf[3], *buf; + size_t alloc_len; + + /* len <= 45 -> expanded to 60 + len byte + new line */ + alloc_len = shar->work.length + 62; + if (archive_string_ensure(&shar->work, alloc_len) == NULL) + __archive_errx(1, "Out of memory"); + + buf = shar->work.s + shar->work.length; + *buf++ = UUENC(len); + while (len >= 3) { + uuencode_group(inbuf, buf); + len -= 3; + inbuf += 3; + buf += 4; + } + if (len != 0) { + tmp_buf[0] = inbuf[0]; + if (len == 1) + tmp_buf[1] = '\0'; + else + tmp_buf[1] = inbuf[1]; + tmp_buf[2] = '\0'; + uuencode_group(tmp_buf, buf); + buf += 4; + } + *buf++ = '\n'; + if ((buf - shar->work.s) >
(ptrdiff_t)(shar->work.length + 62)) + __archive_errx(1, "Buffer overflow"); + shar->work.length = buf - shar->work.s; +} + +static ssize_t +archive_write_shar_data_uuencode(struct archive_write *a, const void *buff, + size_t length) +{ + struct shar *shar; + const char *src; + size_t n; + int ret; + + shar = (struct shar *)a->format_data; + if (!shar->has_data) + return (ARCHIVE_OK); + src = (const char *)buff; + + if (shar->outpos != 0) { + n = 45 - shar->outpos; + if (n > length) + n = length; + memcpy(shar->outbuff + shar->outpos, src, n); + if (shar->outpos + n < 45) { + shar->outpos += n; + return length; + } + uuencode_line(shar, shar->outbuff, 45); + src += n; + n = length - n; + } else { + n = length; + } + + while (n >= 45) { + uuencode_line(shar, src, 45); + src += 45; + n -= 45; + + if (shar->work.length < 65536) + continue; + ret = (*a->compressor.write)(a, shar->work.s, + shar->work.length); + if (ret != ARCHIVE_OK) + return (ARCHIVE_FATAL); + archive_string_empty(&shar->work); + } + if (n != 0) { + memcpy(shar->outbuff, src, n); + shar->outpos = n; + } + return (length); +} + +static int +archive_write_shar_finish_entry(struct archive_write *a) +{ + const char *g, *p, *u; + struct shar *shar; + int ret; + + shar = (struct shar *)a->format_data; + if (shar->entry == NULL) + return (0); + + if (shar->dump) { + /* Finish uuencoded data. */ + if (shar->has_data) { + if (shar->outpos > 0) + uuencode_line(shar, shar->outbuff, + shar->outpos); + archive_strcat(&shar->work, "`\nend\n"); + archive_strcat(&shar->work, "SHAR_END\n"); + } + /* Restore file mode, owner, flags. */ + /* + * TODO: Don't immediately restore mode for + * directories; defer that to end of script. 
+ */ + archive_string_sprintf(&shar->work, "chmod %o ", + archive_entry_mode(shar->entry) & 07777); + shar_quote(&shar->work, archive_entry_pathname(shar->entry), 1); + archive_strcat(&shar->work, "\n"); + + u = archive_entry_uname(shar->entry); + g = archive_entry_gname(shar->entry); + if (u != NULL || g != NULL) { + archive_strcat(&shar->work, "chown "); + if (u != NULL) + shar_quote(&shar->work, u, 1); + if (g != NULL) { + archive_strcat(&shar->work, ":"); + shar_quote(&shar->work, g, 1); + } + archive_strcat(&shar->work, " "); + shar_quote(&shar->work, archive_entry_pathname(shar->entry), 1); + archive_strcat(&shar->work, "\n"); + } + + if ((p = archive_entry_fflags_text(shar->entry)) != NULL) { + archive_string_sprintf(&shar->work, "chflags %s ", + p); + shar_quote(&shar->work, + archive_entry_pathname(shar->entry), 1); + archive_strcat(&shar->work, "\n"); + } + + /* TODO: restore ACLs */ + + } else { + if (shar->has_data) { + /* Finish sed-encoded data: ensure last line ends. */ + if (!shar->end_of_line) + archive_strappend_char(&shar->work, '\n'); + archive_strcat(&shar->work, "SHAR_END\n"); + } + } + + archive_entry_free(shar->entry); + shar->entry = NULL; + + if (shar->work.length < 65536) + return (ARCHIVE_OK); + + ret = (*a->compressor.write)(a, shar->work.s, shar->work.length); + if (ret != ARCHIVE_OK) + return (ARCHIVE_FATAL); + archive_string_empty(&shar->work); + + return (ARCHIVE_OK); +} + +static int +archive_write_shar_finish(struct archive_write *a) +{ + struct shar *shar; + int ret; + + /* + * TODO: Accumulate list of directory names/modes and + * fix them all up at end-of-archive. + */ + + shar = (struct shar *)a->format_data; + + /* + * Only write the end-of-archive markers if the archive was + * actually started. This avoids problems if someone sets + * shar format, then sets another format (which would invoke + * shar_finish to free the format-specific data).
+ */ + if (shar->wrote_header == 0) + return (ARCHIVE_OK); + + archive_strcat(&shar->work, "exit\n"); + + ret = (*a->compressor.write)(a, shar->work.s, shar->work.length); + if (ret != ARCHIVE_OK) + return (ARCHIVE_FATAL); + + /* Shar output is never padded. */ + archive_write_set_bytes_in_last_block(&a->archive, 1); + /* + * TODO: shar should also suppress padding of + * uncompressed data within gzip/bzip2 streams. + */ + + return (ARCHIVE_OK); +} + +static int +archive_write_shar_destroy(struct archive_write *a) +{ + struct shar *shar; + + shar = (struct shar *)a->format_data; + if (shar == NULL) + return (ARCHIVE_OK); + + archive_entry_free(shar->entry); + free(shar->last_dir); + archive_string_free(&(shar->work)); + archive_string_free(&(shar->quoted_name)); + free(shar); + a->format_data = NULL; + return (ARCHIVE_OK); +} diff --git a/lib/libarchive/archive_write_set_format_ustar.c b/lib/libarchive/archive_write_set_format_ustar.c new file mode 100644 index 000000000..208a08a1e --- /dev/null +++ b/lib/libarchive/archive_write_set_format_ustar.c @@ -0,0 +1,690 @@ +/*- + * Copyright (c) 2003-2007 Tim Kientzle + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR(S) ``AS IS'' AND ANY EXPRESS OR + * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES + * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. 
+ * IN NO EVENT SHALL THE AUTHOR(S) BE LIABLE FOR ANY DIRECT, INDIRECT, + * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT + * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF + * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + */ + +#include "archive_platform.h" +__FBSDID("$FreeBSD: head/lib/libarchive/archive_write_set_format_ustar.c 191579 2009-04-27 18:35:03Z kientzle $"); + + +#ifdef HAVE_ERRNO_H +#include <errno.h> +#endif +#include <stdio.h> +#ifdef HAVE_STDLIB_H +#include <stdlib.h> +#endif +#ifdef HAVE_STRING_H +#include <string.h> +#endif + +#include "archive.h" +#include "archive_entry.h" +#include "archive_private.h" +#include "archive_write_private.h" + +#ifdef __minix +#include "minix_utils.h" +#endif + +#ifndef __minix +struct ustar { + uint64_t entry_bytes_remaining; + uint64_t entry_padding; +}; +#else +struct ustar { + size_t entry_bytes_remaining; + off_t entry_padding; +}; +#endif +/* + * Define structure of POSIX 'ustar' tar header.
+ */ +#define USTAR_name_offset 0 +#define USTAR_name_size 100 +#define USTAR_mode_offset 100 +#define USTAR_mode_size 6 +#define USTAR_mode_max_size 8 +#define USTAR_uid_offset 108 +#define USTAR_uid_size 6 +#define USTAR_uid_max_size 8 +#define USTAR_gid_offset 116 +#define USTAR_gid_size 6 +#define USTAR_gid_max_size 8 +#define USTAR_size_offset 124 +#define USTAR_size_size 11 +#define USTAR_size_max_size 12 +#define USTAR_mtime_offset 136 +#define USTAR_mtime_size 11 +#define USTAR_mtime_max_size 11 +#define USTAR_checksum_offset 148 +#define USTAR_checksum_size 8 +#define USTAR_typeflag_offset 156 +#define USTAR_typeflag_size 1 +#define USTAR_linkname_offset 157 +#define USTAR_linkname_size 100 +#define USTAR_magic_offset 257 +#define USTAR_magic_size 6 +#define USTAR_version_offset 263 +#define USTAR_version_size 2 +#define USTAR_uname_offset 265 +#define USTAR_uname_size 32 +#define USTAR_gname_offset 297 +#define USTAR_gname_size 32 +#define USTAR_rdevmajor_offset 329 +#define USTAR_rdevmajor_size 6 +#define USTAR_rdevmajor_max_size 8 +#define USTAR_rdevminor_offset 337 +#define USTAR_rdevminor_size 6 +#define USTAR_rdevminor_max_size 8 +#define USTAR_prefix_offset 345 +#define USTAR_prefix_size 155 +#define USTAR_padding_offset 500 +#define USTAR_padding_size 12 + +/* + * A filled-in copy of the header for initialization. 
+ */ +static const char template_header[] = { + /* name: 100 bytes */ + 0,0,0,0,0,0,0,0, 0,0,0,0,0,0,0,0, 0,0,0,0,0,0,0,0, 0,0,0,0,0,0,0,0, + 0,0,0,0,0,0,0,0, 0,0,0,0,0,0,0,0, 0,0,0,0,0,0,0,0, 0,0,0,0,0,0,0,0, + 0,0,0,0,0,0,0,0, 0,0,0,0,0,0,0,0, 0,0,0,0,0,0,0,0, 0,0,0,0,0,0,0,0, + 0,0,0,0, + /* Mode, space-null termination: 8 bytes */ + '0','0','0','0','0','0', ' ','\0', + /* uid, space-null termination: 8 bytes */ + '0','0','0','0','0','0', ' ','\0', + /* gid, space-null termination: 8 bytes */ + '0','0','0','0','0','0', ' ','\0', + /* size, space termination: 12 bytes */ + '0','0','0','0','0','0','0','0','0','0','0', ' ', + /* mtime, space termination: 12 bytes */ + '0','0','0','0','0','0','0','0','0','0','0', ' ', + /* Initial checksum value: 8 spaces */ + ' ',' ',' ',' ',' ',' ',' ',' ', + /* Typeflag: 1 byte */ + '0', /* '0' = regular file */ + /* Linkname: 100 bytes */ + 0,0,0,0,0,0,0,0, 0,0,0,0,0,0,0,0, 0,0,0,0,0,0,0,0, 0,0,0,0,0,0,0,0, + 0,0,0,0,0,0,0,0, 0,0,0,0,0,0,0,0, 0,0,0,0,0,0,0,0, 0,0,0,0,0,0,0,0, + 0,0,0,0,0,0,0,0, 0,0,0,0,0,0,0,0, 0,0,0,0,0,0,0,0, 0,0,0,0,0,0,0,0, + 0,0,0,0, + /* Magic: 6 bytes, Version: 2 bytes */ + 'u','s','t','a','r','\0', '0','0', + /* Uname: 32 bytes */ + 0,0,0,0,0,0,0,0, 0,0,0,0,0,0,0,0, 0,0,0,0,0,0,0,0, 0,0,0,0,0,0,0,0, + /* Gname: 32 bytes */ + 0,0,0,0,0,0,0,0, 0,0,0,0,0,0,0,0, 0,0,0,0,0,0,0,0, 0,0,0,0,0,0,0,0, + /* rdevmajor + space/null padding: 8 bytes */ + '0','0','0','0','0','0', ' ','\0', + /* rdevminor + space/null padding: 8 bytes */ + '0','0','0','0','0','0', ' ','\0', + /* Prefix: 155 bytes */ + 0,0,0,0,0,0,0,0, 0,0,0,0,0,0,0,0, 0,0,0,0,0,0,0,0, 0,0,0,0,0,0,0,0, + 0,0,0,0,0,0,0,0, 0,0,0,0,0,0,0,0, 0,0,0,0,0,0,0,0, 0,0,0,0,0,0,0,0, + 0,0,0,0,0,0,0,0, 0,0,0,0,0,0,0,0, 0,0,0,0,0,0,0,0, 0,0,0,0,0,0,0,0, + 0,0,0,0,0,0,0,0, 0,0,0,0,0,0,0,0, 0,0,0,0,0,0,0,0, 0,0,0,0,0,0,0,0, + 0,0,0,0,0,0,0,0, 0,0,0,0,0,0,0,0, 0,0,0,0,0,0,0,0, 0,0,0, + /* Padding: 12 bytes */ + 0,0,0,0,0,0,0,0, 0,0,0,0 +}; + +static ssize_t
archive_write_ustar_data(struct archive_write *a, const void *buff, + size_t s); +static int archive_write_ustar_destroy(struct archive_write *); +static int archive_write_ustar_finish(struct archive_write *); +static int archive_write_ustar_finish_entry(struct archive_write *); +static int archive_write_ustar_header(struct archive_write *, + struct archive_entry *entry); +#ifndef __minix +static int format_256(int64_t, char *, int); +static int format_number(int64_t, char *, int size, int max, int strict); +static int format_octal(int64_t, char *, int); +#else +static int format_256(int32_t, char *, int); +static int format_number(int32_t, char *, int size, int max, int strict); +static int format_octal(int32_t, char *, int); +#endif +static int write_nulls(struct archive_write *a, size_t); + +/* + * Set output format to 'ustar' format. + */ +int +archive_write_set_format_ustar(struct archive *_a) +{ + struct archive_write *a = (struct archive_write *)_a; + struct ustar *ustar; + + /* If someone else was already registered, unregister them. */ + if (a->format_destroy != NULL) + (a->format_destroy)(a); + + /* Basic internal sanity test. */ + if (sizeof(template_header) != 512) { + archive_set_error(&a->archive, ARCHIVE_ERRNO_MISC, "Internal: template_header wrong size: %d should be 512", sizeof(template_header)); + return (ARCHIVE_FATAL); + } + + ustar = (struct ustar *)malloc(sizeof(*ustar)); + if (ustar == NULL) { + archive_set_error(&a->archive, ENOMEM, "Can't allocate ustar data"); + return (ARCHIVE_FATAL); + } + memset(ustar, 0, sizeof(*ustar)); + a->format_data = ustar; + + a->pad_uncompressed = 1; /* Mimic gtar in this respect. 
*/ + a->format_name = "ustar"; + a->format_write_header = archive_write_ustar_header; + a->format_write_data = archive_write_ustar_data; + a->format_finish = archive_write_ustar_finish; + a->format_destroy = archive_write_ustar_destroy; + a->format_finish_entry = archive_write_ustar_finish_entry; + a->archive.archive_format = ARCHIVE_FORMAT_TAR_USTAR; + a->archive.archive_format_name = "POSIX ustar"; + return (ARCHIVE_OK); +} + +static int +archive_write_ustar_header(struct archive_write *a, struct archive_entry *entry) +{ + char buff[512]; + int ret, ret2; + struct ustar *ustar; + + ustar = (struct ustar *)a->format_data; + + /* Only regular files (not hardlinks) have data. */ + if (archive_entry_hardlink(entry) != NULL || + archive_entry_symlink(entry) != NULL || + !(archive_entry_filetype(entry) == AE_IFREG)) + archive_entry_set_size(entry, 0); + + if (AE_IFDIR == archive_entry_filetype(entry)) { + const char *p; + char *t; + /* + * Ensure a trailing '/'. Modify the entry so + * the client sees the change. + */ + p = archive_entry_pathname(entry); + if (p[strlen(p) - 1] != '/') { + t = (char *)malloc(strlen(p) + 2); + if (t == NULL) { + archive_set_error(&a->archive, ENOMEM, + "Can't allocate ustar data"); + return(ARCHIVE_FATAL); + } + strcpy(t, p); + strcat(t, "/"); + archive_entry_copy_pathname(entry, t); + free(t); + } + } + + ret = __archive_write_format_header_ustar(a, buff, entry, -1, 1); + if (ret < ARCHIVE_WARN) + return (ret); + ret2 = (a->compressor.write)(a, buff, 512); + if (ret2 < ARCHIVE_WARN) + return (ret2); + if (ret2 < ret) + ret = ret2; + + ustar->entry_bytes_remaining = archive_entry_size(entry); +#ifndef __minix + ustar->entry_padding = 0x1ff & (-(int64_t)ustar->entry_bytes_remaining); +#else + ustar->entry_padding = 0x1ff & (-(int32_t)ustar->entry_bytes_remaining); +#endif + return (ret); +} + +/* + * Format a basic 512-byte "ustar" header. + * + * Returns -1 if format failed (due to field overflow). 
+ * Note that this always formats as much of the header as possible. + * If "strict" is set to zero, it will extend numeric fields as + * necessary (overwriting terminators or using base-256 extensions). + * + * This is exported so that other 'tar' formats can use it. + */ +int +__archive_write_format_header_ustar(struct archive_write *a, char h[512], + struct archive_entry *entry, int tartype, int strict) +{ + unsigned int checksum; + int i, ret; + size_t copy_length; + const char *p, *pp; + int mytartype; + + ret = 0; + mytartype = -1; + /* + * The "template header" already includes the "ustar" + * signature, various end-of-field markers and other required + * elements. + */ + memcpy(h, &template_header, 512); + + /* + * Because the block is already null-filled, and strings + * are allowed to exactly fill their destination (without null), + * I use memcpy(dest, src, strlen()) here a lot to copy strings. + */ + + pp = archive_entry_pathname(entry); + if (strlen(pp) <= USTAR_name_size) + memcpy(h + USTAR_name_offset, pp, strlen(pp)); + else { + /* Store in two pieces, splitting at a '/'. */ + p = strchr(pp + strlen(pp) - USTAR_name_size - 1, '/'); + /* + * Look for the next '/' if we chose the first character + * as the separator. (ustar format doesn't permit + * an empty prefix.) + */ + if (p == pp) + p = strchr(p + 1, '/'); + /* Fail if the name won't fit. */ + if (!p) { + /* No separator. */ + archive_set_error(&a->archive, ENAMETOOLONG, + "Pathname too long"); + ret = ARCHIVE_FAILED; + } else if (p[1] == '\0') { + /* + * The only feasible separator is a final '/'; + * this would result in a non-empty prefix and + * an empty name, which POSIX doesn't + * explicity forbid, but it just feels wrong. + */ + archive_set_error(&a->archive, ENAMETOOLONG, + "Pathname too long"); + ret = ARCHIVE_FAILED; + } else if (p > pp + USTAR_prefix_size) { + /* Prefix is too long. 
*/ + archive_set_error(&a->archive, ENAMETOOLONG, + "Pathname too long"); + ret = ARCHIVE_FAILED; + } else { + /* Copy prefix and remainder to appropriate places */ + memcpy(h + USTAR_prefix_offset, pp, p - pp); + memcpy(h + USTAR_name_offset, p + 1, pp + strlen(pp) - p - 1); + } + } + + p = archive_entry_hardlink(entry); + if (p != NULL) + mytartype = '1'; + else + p = archive_entry_symlink(entry); + if (p != NULL && p[0] != '\0') { + copy_length = strlen(p); + if (copy_length > USTAR_linkname_size) { + archive_set_error(&a->archive, ENAMETOOLONG, + "Link contents too long"); + ret = ARCHIVE_FAILED; + copy_length = USTAR_linkname_size; + } + memcpy(h + USTAR_linkname_offset, p, copy_length); + } + + p = archive_entry_uname(entry); + if (p != NULL && p[0] != '\0') { + copy_length = strlen(p); + if (copy_length > USTAR_uname_size) { + archive_set_error(&a->archive, ARCHIVE_ERRNO_MISC, + "Username too long"); + ret = ARCHIVE_FAILED; + copy_length = USTAR_uname_size; + } + memcpy(h + USTAR_uname_offset, p, copy_length); + } + + p = archive_entry_gname(entry); + if (p != NULL && p[0] != '\0') { + copy_length = strlen(p); + if (strlen(p) > USTAR_gname_size) { + archive_set_error(&a->archive, ARCHIVE_ERRNO_MISC, + "Group name too long"); + ret = ARCHIVE_FAILED; + copy_length = USTAR_gname_size; + } + memcpy(h + USTAR_gname_offset, p, copy_length); + } + + if (format_number(archive_entry_mode(entry) & 07777, h + USTAR_mode_offset, USTAR_mode_size, USTAR_mode_max_size, strict)) { + archive_set_error(&a->archive, ERANGE, "Numeric mode too large"); + ret = ARCHIVE_FAILED; + } + + if (format_number(archive_entry_uid(entry), h + USTAR_uid_offset, USTAR_uid_size, USTAR_uid_max_size, strict)) { + archive_set_error(&a->archive, ERANGE, "Numeric user ID too large"); + ret = ARCHIVE_FAILED; + } + + if (format_number(archive_entry_gid(entry), h + USTAR_gid_offset, USTAR_gid_size, USTAR_gid_max_size, strict)) { + archive_set_error(&a->archive, ERANGE, "Numeric group ID too large"); + 
ret = ARCHIVE_FAILED; + } + + if (format_number(archive_entry_size(entry), h + USTAR_size_offset, USTAR_size_size, USTAR_size_max_size, strict)) { + archive_set_error(&a->archive, ERANGE, "File size out of range"); + ret = ARCHIVE_FAILED; + } + + if (format_number(archive_entry_mtime(entry), h + USTAR_mtime_offset, USTAR_mtime_size, USTAR_mtime_max_size, strict)) { + archive_set_error(&a->archive, ERANGE, + "File modification time too large"); + ret = ARCHIVE_FAILED; + } + + if (archive_entry_filetype(entry) == AE_IFBLK + || archive_entry_filetype(entry) == AE_IFCHR) { + if (format_number(archive_entry_rdevmajor(entry), h + USTAR_rdevmajor_offset, + USTAR_rdevmajor_size, USTAR_rdevmajor_max_size, strict)) { + archive_set_error(&a->archive, ERANGE, + "Major device number too large"); + ret = ARCHIVE_FAILED; + } + + if (format_number(archive_entry_rdevminor(entry), h + USTAR_rdevminor_offset, + USTAR_rdevminor_size, USTAR_rdevminor_max_size, strict)) { + archive_set_error(&a->archive, ERANGE, + "Minor device number too large"); + ret = ARCHIVE_FAILED; + } + } + + if (tartype >= 0) { + h[USTAR_typeflag_offset] = tartype; + } else if (mytartype >= 0) { + h[USTAR_typeflag_offset] = mytartype; + } else { + switch (archive_entry_filetype(entry)) { + case AE_IFREG: h[USTAR_typeflag_offset] = '0' ; break; + case AE_IFLNK: h[USTAR_typeflag_offset] = '2' ; break; + case AE_IFCHR: h[USTAR_typeflag_offset] = '3' ; break; + case AE_IFBLK: h[USTAR_typeflag_offset] = '4' ; break; + case AE_IFDIR: h[USTAR_typeflag_offset] = '5' ; break; + case AE_IFIFO: h[USTAR_typeflag_offset] = '6' ; break; + case AE_IFSOCK: + archive_set_error(&a->archive, + ARCHIVE_ERRNO_FILE_FORMAT, + "tar format cannot archive socket"); + return (ARCHIVE_FAILED); + default: + archive_set_error(&a->archive, + ARCHIVE_ERRNO_FILE_FORMAT, + "tar format cannot archive this (mode=0%lo)", + (unsigned long)archive_entry_mode(entry)); + ret = ARCHIVE_FAILED; + } + } + + checksum = 0; + for (i = 0; i < 512; i++) + 
checksum += 255 & (unsigned int)h[i]; + h[USTAR_checksum_offset + 6] = '\0'; /* Can't be pre-set in the template. */ + /* h[USTAR_checksum_offset + 7] = ' '; */ /* This is pre-set in the template. */ + format_octal(checksum, h + USTAR_checksum_offset, 6); + return (ret); +} + +/* + * Format a number into a field, with some intelligence. + */ +#ifndef __minix +static int +format_number(int64_t v, char *p, int s, int maxsize, int strict) +{ + int64_t limit; + + limit = ((int64_t)1 << (s*3)); + + /* "Strict" only permits octal values with proper termination. */ + if (strict) + return (format_octal(v, p, s)); + + /* + * In non-strict mode, we allow the number to overwrite one or + * more bytes of the field termination. Even old tar + * implementations should be able to handle this with no + * problem. + */ + if (v >= 0) { + while (s <= maxsize) { + if (v < limit) + return (format_octal(v, p, s)); + s++; + limit <<= 3; + } + } + + /* Base-256 can handle any number, positive or negative. */ + return (format_256(v, p, maxsize)); +} +#else +static int +format_number(int32_t v, char *p, int s, int maxsize, int strict) +{ + /* s could be 11 in some cases causing limit to be shifted by + * greater than 32 bits so we need a u64_t here + */ + u64_t limit; + + /* limit = (1 << (s*3)) */ + limit = lshift64(cvu64(1), s*3); + + /* "Strict" only permits octal values with proper termination. */ + if (strict) + return (format_octal(v, p, s)); + + /* + * In non-strict mode, we allow the number to overwrite one or + * more bytes of the field termination. Even old tar + * implementations should be able to handle this with no + * problem. + */ + if (v >= 0) { + while (s <= maxsize) { + /* if (v < limit) */ + if (cmp64ul(limit, v) > 0) + return (format_octal(v, p, s)); + s++; + /* limit <<= 3 */ + limit = lshift64(limit, 3); + } + } + + /* Base-256 can handle any number, positive or negative. 
*/ + return (format_256(v, p, maxsize)); +} +#endif + +/* + * Format a number into the specified field using base-256. + */ +#ifndef __minix +static int +format_256(int64_t v, char *p, int s) +{ + p += s; + while (s-- > 0) { + *--p = (char)(v & 0xff); + v >>= 8; + } + *p |= 0x80; /* Set the base-256 marker bit. */ + return (0); +} +#else +static int +format_256(int32_t v, char *p, int s) +{ + p += s; + while (s-- > 0) { + *--p = (char)(v & 0xff); + v >>= 8; + } + *p |= 0x80; /* Set the base-256 marker bit. */ + return (0); +} +#endif +/* + * Format a number into the specified field. + */ +#ifndef __minix +static int +format_octal(int64_t v, char *p, int s) +{ + int len; + + len = s; + + /* Octal values can't be negative, so use 0. */ + if (v < 0) { + while (len-- > 0) + *p++ = '0'; + return (-1); + } + + p += s; /* Start at the end and work backwards. */ + while (s-- > 0) { + *--p = (char)('0' + (v & 7)); + v >>= 3; + } + + if (v == 0) + return (0); + + /* If it overflowed, fill field with max value. */ + while (len-- > 0) + *p++ = '7'; + + return (-1); +} +#else +static int +format_octal(int32_t v, char *p, int s) +{ + int len; + + len = s; + + /* Octal values can't be negative, so use 0. */ + if (v < 0) { + while (len-- > 0) + *p++ = '0'; + return (-1); + } + + p += s; /* Start at the end and work backwards. */ + while (s-- > 0) { + *--p = (char)('0' + (v & 7)); + v >>= 3; + } + + if (v == 0) + return (0); + + /* If it overflowed, fill field with max value. 
*/ + while (len-- > 0) + *p++ = '7'; + + return (-1); +} +#endif + +static int +archive_write_ustar_finish(struct archive_write *a) +{ + int r; + + if (a->compressor.write == NULL) + return (ARCHIVE_OK); + + r = write_nulls(a, 512*2); + return (r); +} + +static int +archive_write_ustar_destroy(struct archive_write *a) +{ + struct ustar *ustar; + + ustar = (struct ustar *)a->format_data; + free(ustar); + a->format_data = NULL; + return (ARCHIVE_OK); +} + +static int +archive_write_ustar_finish_entry(struct archive_write *a) +{ + struct ustar *ustar; + int ret; + + ustar = (struct ustar *)a->format_data; + ret = write_nulls(a, + ustar->entry_bytes_remaining + ustar->entry_padding); + ustar->entry_bytes_remaining = ustar->entry_padding = 0; + return (ret); +} + +static int +write_nulls(struct archive_write *a, size_t padding) +{ + int ret; + size_t to_write; + + while (padding > 0) { + to_write = padding < a->null_length ? padding : a->null_length; + ret = (a->compressor.write)(a, a->nulls, to_write); + if (ret != ARCHIVE_OK) + return (ret); + padding -= to_write; + } + return (ARCHIVE_OK); +} + +static ssize_t +archive_write_ustar_data(struct archive_write *a, const void *buff, size_t s) +{ + struct ustar *ustar; + int ret; + + ustar = (struct ustar *)a->format_data; + if (s > ustar->entry_bytes_remaining) + s = ustar->entry_bytes_remaining; + ret = (a->compressor.write)(a, buff, s); + ustar->entry_bytes_remaining -= s; + if (ret != ARCHIVE_OK) + return (ret); + return (s); +} diff --git a/lib/libarchive/archive_write_set_format_zip.c b/lib/libarchive/archive_write_set_format_zip.c new file mode 100644 index 000000000..ca51e72b9 --- /dev/null +++ b/lib/libarchive/archive_write_set_format_zip.c @@ -0,0 +1,681 @@ +/*- + * Copyright (c) 2008 Anselm Strauss + * Copyright (c) 2009 Joerg Sonnenberger + * All rights reserved. 
+ * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR(S) ``AS IS'' AND ANY EXPRESS OR + * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES + * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. + * IN NO EVENT SHALL THE AUTHOR(S) BE LIABLE FOR ANY DIRECT, INDIRECT, + * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT + * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF + * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + */ + +/* + * Development supported by Google Summer of Code 2008. + */ + +/* + * The current implementation is very limited: + * + * - No encryption support. + * - No ZIP64 support. + * - No support for splitting and spanning. + * - Only supports regular file and folder entries. + * + * Note that generally data in ZIP files is little-endian encoded, + * with some exceptions. + * + * TODO: Since Libarchive is generally 64bit oriented, but this implementation + * does not yet support sizes exceeding 32bit, it is highly fragile for + * big archives. This should change when ZIP64 is finally implemented, otherwise + * some serious checking has to be done. 
+ * + */ + +#include "archive_platform.h" +__FBSDID("$FreeBSD: head/lib/libarchive/archive_write_set_format_zip.c 201168 2009-12-29 06:15:32Z kientzle $"); + +#ifdef HAVE_ERRNO_H +#include +#endif +#include +#ifdef HAVE_STDLIB_H +#include +#endif +#ifdef HAVE_STRING_H +#include +#endif +#ifdef HAVE_ZLIB_H +#include +#endif + +#include "archive.h" +#include "archive_endian.h" +#include "archive_entry.h" +#include "archive_private.h" +#include "archive_write_private.h" + +#ifndef HAVE_ZLIB_H +#include "archive_crc32.h" +#endif + +#define ZIP_SIGNATURE_LOCAL_FILE_HEADER 0x04034b50 +#define ZIP_SIGNATURE_DATA_DESCRIPTOR 0x08074b50 +#define ZIP_SIGNATURE_FILE_HEADER 0x02014b50 +#define ZIP_SIGNATURE_CENTRAL_DIRECTORY_END 0x06054b50 +#define ZIP_SIGNATURE_EXTRA_TIMESTAMP 0x5455 +#define ZIP_SIGNATURE_EXTRA_UNIX 0x7855 +#define ZIP_VERSION_EXTRACT 0x0014 /* ZIP version 2.0 is needed. */ +#define ZIP_VERSION_BY 0x0314 /* Made by UNIX, using ZIP version 2.0. */ +#define ZIP_FLAGS 0x08 /* Flagging bit 3 (count from 0) for using data descriptor. 
*/ + +enum compression { + COMPRESSION_STORE = 0 +#ifdef HAVE_ZLIB_H + , + COMPRESSION_DEFLATE = 8 +#endif +}; + +static ssize_t archive_write_zip_data(struct archive_write *, const void *buff, size_t s); +static int archive_write_zip_finish(struct archive_write *); +static int archive_write_zip_destroy(struct archive_write *); +static int archive_write_zip_finish_entry(struct archive_write *); +static int archive_write_zip_header(struct archive_write *, struct archive_entry *); +static unsigned int dos_time(const time_t); +static size_t path_length(struct archive_entry *); +static int write_path(struct archive_entry *, struct archive_write *); + +struct zip_local_file_header { + char signature[4]; + char version[2]; + char flags[2]; + char compression[2]; + char timedate[4]; + char crc32[4]; + char compressed_size[4]; + char uncompressed_size[4]; + char filename_length[2]; + char extra_length[2]; +}; + +struct zip_file_header { + char signature[4]; + char version_by[2]; + char version_extract[2]; + char flags[2]; + char compression[2]; + char timedate[4]; + char crc32[4]; + char compressed_size[4]; + char uncompressed_size[4]; + char filename_length[2]; + char extra_length[2]; + char comment_length[2]; + char disk_number[2]; + char attributes_internal[2]; + char attributes_external[4]; + char offset[4]; +}; + +struct zip_data_descriptor { + char signature[4]; /* Not mandatory, but recommended by specification. 
*/ + char crc32[4]; + char compressed_size[4]; + char uncompressed_size[4]; +}; + +struct zip_extra_data_local { + char time_id[2]; + char time_size[2]; + char time_flag[1]; + char mtime[4]; + char atime[4]; + char ctime[4]; + char unix_id[2]; + char unix_size[2]; + char unix_uid[2]; + char unix_gid[2]; +}; + +struct zip_extra_data_central { + char time_id[2]; + char time_size[2]; + char time_flag[1]; + char mtime[4]; + char unix_id[2]; + char unix_size[2]; +}; + +struct zip_file_header_link { + struct zip_file_header_link *next; + struct archive_entry *entry; + off_t offset; + unsigned long crc32; + off_t compressed_size; + enum compression compression; +}; + +struct zip { + struct zip_data_descriptor data_descriptor; + struct zip_file_header_link *central_directory; + struct zip_file_header_link *central_directory_end; +#ifndef __minix + int64_t offset; + int64_t written_bytes; + int64_t remaining_data_bytes; +#else + off_t offset; + size_t written_bytes; + size_t remaining_data_bytes; +#endif + enum compression compression; + +#ifdef HAVE_ZLIB_H + z_stream stream; + size_t len_buf; + unsigned char *buf; +#endif +}; + +struct zip_central_directory_end { + char signature[4]; + char disk[2]; + char start_disk[2]; + char entries_disk[2]; + char entries[2]; + char size[4]; + char offset[4]; + char comment_length[2]; +}; + +static int +archive_write_zip_options(struct archive_write *a, const char *key, + const char *value) +{ + struct zip *zip = a->format_data; + + if (strcmp(key, "compression") == 0) { + if (strcmp(value, "deflate") == 0) { +#ifdef HAVE_ZLIB_H + zip->compression = COMPRESSION_DEFLATE; +#else + archive_set_error(&a->archive, ARCHIVE_ERRNO_MISC, + "deflate compression not supported"); + return ARCHIVE_WARN; +#endif + } else if (strcmp(value, "store") == 0) + zip->compression = COMPRESSION_STORE; + else + return (ARCHIVE_WARN); + return (ARCHIVE_OK); + } + return (ARCHIVE_WARN); +} + +int +archive_write_set_format_zip(struct archive *_a) +{ + struct 
archive_write *a = (struct archive_write *)_a; + struct zip *zip; + + /* If another format was already registered, unregister it. */ + if (a->format_destroy != NULL) + (a->format_destroy)(a); + + zip = (struct zip *) calloc(1, sizeof(*zip)); + if (zip == NULL) { + archive_set_error(&a->archive, ENOMEM, "Can't allocate zip data"); + return (ARCHIVE_FATAL); + } + zip->central_directory = NULL; + zip->central_directory_end = NULL; + zip->offset = 0; + zip->written_bytes = 0; + zip->remaining_data_bytes = 0; + +#ifdef HAVE_ZLIB_H + zip->compression = COMPRESSION_DEFLATE; + zip->len_buf = 65536; + zip->buf = malloc(zip->len_buf); + if (zip->buf == NULL) { + archive_set_error(&a->archive, ENOMEM, "Can't allocate compression buffer"); + return (ARCHIVE_FATAL); + } +#else + zip->compression = COMPRESSION_STORE; +#endif + + a->format_data = zip; + + a->pad_uncompressed = 0; /* Actually not needed for now, since no compression support yet. */ + a->format_name = "zip"; + a->format_options = archive_write_zip_options; + a->format_write_header = archive_write_zip_header; + a->format_write_data = archive_write_zip_data; + a->format_finish_entry = archive_write_zip_finish_entry; + a->format_finish = archive_write_zip_finish; + a->format_destroy = archive_write_zip_destroy; + a->archive.archive_format = ARCHIVE_FORMAT_ZIP; + a->archive.archive_format_name = "ZIP"; + + archive_le32enc(&zip->data_descriptor.signature, + ZIP_SIGNATURE_DATA_DESCRIPTOR); + + return (ARCHIVE_OK); +} + +static int +archive_write_zip_header(struct archive_write *a, struct archive_entry *entry) +{ + struct zip *zip; + struct zip_local_file_header h; + struct zip_extra_data_local e; + struct zip_data_descriptor *d; + struct zip_file_header_link *l; + int ret; +#ifndef __minix + int64_t size; +#else + ssize_t size; +#endif + mode_t type; + + /* Entries other than a regular file or a folder are skipped. 
*/ + type = archive_entry_filetype(entry); + if ((type != AE_IFREG) & (type != AE_IFDIR)) { + archive_set_error(&a->archive, ARCHIVE_ERRNO_MISC, "Filetype not supported"); + return ARCHIVE_FAILED; + }; + + /* Directory entries should have a size of 0. */ + if (type == AE_IFDIR) + archive_entry_set_size(entry, 0); + + zip = a->format_data; + d = &zip->data_descriptor; + size = archive_entry_size(entry); + zip->remaining_data_bytes = size; + + /* Append archive entry to the central directory data. */ + l = (struct zip_file_header_link *) malloc(sizeof(*l)); + if (l == NULL) { + archive_set_error(&a->archive, ENOMEM, "Can't allocate zip header data"); + return (ARCHIVE_FATAL); + } + l->entry = archive_entry_clone(entry); + /* Initialize the CRC variable and potentially the local crc32(). */ + l->crc32 = crc32(0, NULL, 0); + l->compression = zip->compression; + l->compressed_size = 0; + l->next = NULL; + if (zip->central_directory == NULL) { + zip->central_directory = l; + } else { + zip->central_directory_end->next = l; + } + zip->central_directory_end = l; + + /* Store the offset of this header for later use in central directory. */ + l->offset = zip->written_bytes; + + memset(&h, 0, sizeof(h)); + archive_le32enc(&h.signature, ZIP_SIGNATURE_LOCAL_FILE_HEADER); + archive_le16enc(&h.version, ZIP_VERSION_EXTRACT); + archive_le16enc(&h.flags, ZIP_FLAGS); + archive_le16enc(&h.compression, zip->compression); + archive_le32enc(&h.timedate, dos_time(archive_entry_mtime(entry))); + archive_le16enc(&h.filename_length, (uint16_t)path_length(entry)); + + switch (zip->compression) { + case COMPRESSION_STORE: + /* Setting compressed and uncompressed sizes even when specification says + * to set to zero when using data descriptors. Otherwise the end of the + * data for an entry is rather difficult to find. 
*/ + archive_le32enc(&h.compressed_size, size); + archive_le32enc(&h.uncompressed_size, size); + break; +#ifdef HAVE_ZLIB_H + case COMPRESSION_DEFLATE: + archive_le32enc(&h.uncompressed_size, size); + + zip->stream.zalloc = Z_NULL; + zip->stream.zfree = Z_NULL; + zip->stream.opaque = Z_NULL; + zip->stream.next_out = zip->buf; + zip->stream.avail_out = zip->len_buf; + if (deflateInit2(&zip->stream, Z_DEFAULT_COMPRESSION, Z_DEFLATED, + -15, 8, Z_DEFAULT_STRATEGY) != Z_OK) { + archive_set_error(&a->archive, ENOMEM, "Can't init deflate compressor"); + return (ARCHIVE_FATAL); + } + break; +#endif + } + + /* Formatting extra data. */ + archive_le16enc(&h.extra_length, sizeof(e)); + archive_le16enc(&e.time_id, ZIP_SIGNATURE_EXTRA_TIMESTAMP); + archive_le16enc(&e.time_size, sizeof(e.time_flag) + + sizeof(e.mtime) + sizeof(e.atime) + sizeof(e.ctime)); + e.time_flag[0] = 0x07; + archive_le32enc(&e.mtime, archive_entry_mtime(entry)); + archive_le32enc(&e.atime, archive_entry_atime(entry)); + archive_le32enc(&e.ctime, archive_entry_ctime(entry)); + + archive_le16enc(&e.unix_id, ZIP_SIGNATURE_EXTRA_UNIX); + archive_le16enc(&e.unix_size, sizeof(e.unix_uid) + sizeof(e.unix_gid)); + archive_le16enc(&e.unix_uid, archive_entry_uid(entry)); + archive_le16enc(&e.unix_gid, archive_entry_gid(entry)); + + archive_le32enc(&d->uncompressed_size, size); + + ret = (a->compressor.write)(a, &h, sizeof(h)); + if (ret != ARCHIVE_OK) + return (ARCHIVE_FATAL); + zip->written_bytes += sizeof(h); + + ret = write_path(entry, a); + if (ret <= ARCHIVE_OK) + return (ARCHIVE_FATAL); + zip->written_bytes += ret; + + ret = (a->compressor.write)(a, &e, sizeof(e)); + if (ret != ARCHIVE_OK) + return (ARCHIVE_FATAL); + zip->written_bytes += sizeof(e); + + return (ARCHIVE_OK); +} + +static ssize_t +archive_write_zip_data(struct archive_write *a, const void *buff, size_t s) +{ + int ret; + struct zip *zip = a->format_data; + struct zip_file_header_link *l = zip->central_directory_end; + +#ifndef __minix + if 
((int64_t)s > zip->remaining_data_bytes) + s = (size_t)zip->remaining_data_bytes; +#else + if (s > zip->remaining_data_bytes) + s = (size_t)zip->remaining_data_bytes; +#endif + if (s == 0) return 0; + + switch (zip->compression) { + case COMPRESSION_STORE: + ret = (a->compressor.write)(a, buff, s); + if (ret != ARCHIVE_OK) return (ret); + zip->written_bytes += s; + zip->remaining_data_bytes -= s; + l->compressed_size += s; + l->crc32 = crc32(l->crc32, buff, s); + return (s); +#if HAVE_ZLIB_H + case COMPRESSION_DEFLATE: + zip->stream.next_in = (unsigned char*)(uintptr_t)buff; + zip->stream.avail_in = s; + do { + ret = deflate(&zip->stream, Z_NO_FLUSH); + if (ret == Z_STREAM_ERROR) + return (ARCHIVE_FATAL); + if (zip->stream.avail_out == 0) { + ret = (a->compressor.write)(a, zip->buf, zip->len_buf); + if (ret != ARCHIVE_OK) + return (ret); + l->compressed_size += zip->len_buf; + zip->written_bytes += zip->len_buf; + zip->stream.next_out = zip->buf; + zip->stream.avail_out = zip->len_buf; + } + } while (zip->stream.avail_in != 0); + zip->remaining_data_bytes -= s; + /* If we have it, use zlib's fast crc32() */ + l->crc32 = crc32(l->crc32, buff, s); + return (s); +#endif + + default: + archive_set_error(&a->archive, ARCHIVE_ERRNO_MISC, + "Invalid ZIP compression type"); + return ARCHIVE_FATAL; + } +} + +static int +archive_write_zip_finish_entry(struct archive_write *a) +{ + /* Write the data descripter after file data has been written. 
*/ + int ret; + struct zip *zip = a->format_data; + struct zip_data_descriptor *d = &zip->data_descriptor; + struct zip_file_header_link *l = zip->central_directory_end; +#if HAVE_ZLIB_H + size_t reminder; +#endif + + switch(zip->compression) { + case COMPRESSION_STORE: + break; +#if HAVE_ZLIB_H + case COMPRESSION_DEFLATE: + for (;;) { + ret = deflate(&zip->stream, Z_FINISH); + if (ret == Z_STREAM_ERROR) + return (ARCHIVE_FATAL); + reminder = zip->len_buf - zip->stream.avail_out; + ret = (a->compressor.write)(a, zip->buf, reminder); + if (ret != ARCHIVE_OK) + return (ret); + l->compressed_size += reminder; + zip->written_bytes += reminder; + zip->stream.next_out = zip->buf; + if (zip->stream.avail_out != 0) + break; + zip->stream.avail_out = zip->len_buf; + } + deflateEnd(&zip->stream); + break; +#endif + } + + archive_le32enc(&d->crc32, l->crc32); + archive_le32enc(&d->compressed_size, l->compressed_size); + ret = (a->compressor.write)(a, d, sizeof(*d)); + if (ret != ARCHIVE_OK) + return (ARCHIVE_FATAL); + zip->written_bytes += sizeof(*d); + return (ARCHIVE_OK); +} + +static int +archive_write_zip_finish(struct archive_write *a) +{ + struct zip *zip; + struct zip_file_header_link *l; + struct zip_file_header h; + struct zip_central_directory_end end; + struct zip_extra_data_central e; + off_t offset_start, offset_end; + int entries; + int ret; + + zip = a->format_data; + l = zip->central_directory; + + /* + * Formatting central directory file header fields that are fixed for all entries. 
+ * Fields not used (and therefor 0) are: + * + * - comment_length + * - disk_number + * - attributes_internal + */ + memset(&h, 0, sizeof(h)); + archive_le32enc(&h.signature, ZIP_SIGNATURE_FILE_HEADER); + archive_le16enc(&h.version_by, ZIP_VERSION_BY); + archive_le16enc(&h.version_extract, ZIP_VERSION_EXTRACT); + archive_le16enc(&h.flags, ZIP_FLAGS); + + entries = 0; + offset_start = zip->written_bytes; + + /* Formatting individual header fields per entry and + * writing each entry. */ + while (l != NULL) { + archive_le16enc(&h.compression, l->compression); + archive_le32enc(&h.timedate, dos_time(archive_entry_mtime(l->entry))); + archive_le32enc(&h.crc32, l->crc32); + archive_le32enc(&h.compressed_size, l->compressed_size); + archive_le32enc(&h.uncompressed_size, archive_entry_size(l->entry)); + archive_le16enc(&h.filename_length, (uint16_t)path_length(l->entry)); + archive_le16enc(&h.extra_length, sizeof(e)); + archive_le16enc(&h.attributes_external[2], archive_entry_mode(l->entry)); + archive_le32enc(&h.offset, l->offset); + + /* Formatting extra data. */ + archive_le16enc(&e.time_id, ZIP_SIGNATURE_EXTRA_TIMESTAMP); + archive_le16enc(&e.time_size, sizeof(e.mtime) + sizeof(e.time_flag)); + e.time_flag[0] = 0x07; + archive_le32enc(&e.mtime, archive_entry_mtime(l->entry)); + archive_le16enc(&e.unix_id, ZIP_SIGNATURE_EXTRA_UNIX); + archive_le16enc(&e.unix_size, 0x0000); + + ret = (a->compressor.write)(a, &h, sizeof(h)); + if (ret != ARCHIVE_OK) + return (ARCHIVE_FATAL); + zip->written_bytes += sizeof(h); + + ret = write_path(l->entry, a); + if (ret <= ARCHIVE_OK) + return (ARCHIVE_FATAL); + zip->written_bytes += ret; + + ret = (a->compressor.write)(a, &e, sizeof(e)); + if (ret != ARCHIVE_OK) + return (ARCHIVE_FATAL); + zip->written_bytes += sizeof(e); + + l = l->next; + entries++; + } + offset_end = zip->written_bytes; + + /* Formatting end of central directory. 
*/ + memset(&end, 0, sizeof(end)); + archive_le32enc(&end.signature, ZIP_SIGNATURE_CENTRAL_DIRECTORY_END); + archive_le16enc(&end.entries_disk, entries); + archive_le16enc(&end.entries, entries); + archive_le32enc(&end.size, offset_end - offset_start); + archive_le32enc(&end.offset, offset_start); + + /* Writing end of central directory. */ + ret = (a->compressor.write)(a, &end, sizeof(end)); + if (ret != ARCHIVE_OK) + return (ARCHIVE_FATAL); + zip->written_bytes += sizeof(end); + return (ARCHIVE_OK); +} + +static int +archive_write_zip_destroy(struct archive_write *a) +{ + struct zip *zip; + struct zip_file_header_link *l; + + zip = a->format_data; + while (zip->central_directory != NULL) { + l = zip->central_directory; + zip->central_directory = l->next; + archive_entry_free(l->entry); + free(l); + } +#ifdef HAVE_ZLIB_H + free(zip->buf); +#endif + free(zip); + a->format_data = NULL; + return (ARCHIVE_OK); +} + +/* Convert into MSDOS-style date/time. */ +static unsigned int +dos_time(const time_t unix_time) +{ + struct tm *t; + unsigned int dt; + + /* This will not preserve time when creating/extracting the archive + * on two systems with different time zones. */ + t = localtime(&unix_time); + + dt = 0; + dt += ((t->tm_year - 80) & 0x7f) << 9; + dt += ((t->tm_mon + 1) & 0x0f) << 5; + dt += (t->tm_mday & 0x1f); + dt <<= 16; + dt += (t->tm_hour & 0x1f) << 11; + dt += (t->tm_min & 0x3f) << 5; + dt += (t->tm_sec & 0x3e) >> 1; /* Only counting every 2 seconds. 
*/ + return dt; +} + +static size_t +path_length(struct archive_entry *entry) +{ + mode_t type; + const char *path; + + type = archive_entry_filetype(entry); + path = archive_entry_pathname(entry); + + if ((type == AE_IFDIR) & (path[strlen(path) - 1] != '/')) { + return strlen(path) + 1; + } else { + return strlen(path); + } +} + +static int +write_path(struct archive_entry *entry, struct archive_write *archive) +{ + int ret; + const char *path; + mode_t type; + size_t written_bytes; + + path = archive_entry_pathname(entry); + type = archive_entry_filetype(entry); + written_bytes = 0; + + ret = (archive->compressor.write)(archive, path, strlen(path)); + if (ret != ARCHIVE_OK) + return (ARCHIVE_FATAL); + written_bytes += strlen(path); + + /* Folders are recognized by a trailing slash. */ + if ((type == AE_IFDIR) & (path[strlen(path) - 1] != '/')) { + ret = (archive->compressor.write)(archive, "/", 1); + if (ret != ARCHIVE_OK) + return (ARCHIVE_FATAL); + written_bytes += 1; + } + + return ((int)written_bytes); +} diff --git a/lib/libarchive/config.h b/lib/libarchive/config.h new file mode 100644 index 000000000..9e2a1f626 --- /dev/null +++ b/lib/libarchive/config.h @@ -0,0 +1,763 @@ +/* config.h. Generated from config.h.in by configure. */ +/* config.h.in. Generated from configure.ac by autoheader. */ + +/* Version number of bsdcpio */ +#define BSDCPIO_VERSION_STRING "2.8.3" + +/* Version number of bsdtar */ +#define BSDTAR_VERSION_STRING "2.8.3" + +/* Define to 1 if you have the `acl_create_entry' function. */ +/* #undef HAVE_ACL_CREATE_ENTRY */ + +/* Define to 1 if you have the `acl_get_link' function. */ +/* #undef HAVE_ACL_GET_LINK */ + +/* Define to 1 if you have the `acl_get_link_np' function. */ +/* #undef HAVE_ACL_GET_LINK_NP */ + +/* Define to 1 if you have the `acl_get_perm' function. */ +/* #undef HAVE_ACL_GET_PERM */ + +/* Define to 1 if you have the `acl_get_perm_np' function.
*/ +/* #undef HAVE_ACL_GET_PERM_NP */ + +/* Define to 1 if you have the `acl_init' function. */ +/* #undef HAVE_ACL_INIT */ + +/* Define to 1 if you have the header file. */ +/* #undef HAVE_ACL_LIBACL_H */ + +/* Define to 1 if the system has the type `acl_permset_t'. */ +/* #undef HAVE_ACL_PERMSET_T */ + +/* Define to 1 if you have the `acl_set_fd' function. */ +/* #undef HAVE_ACL_SET_FD */ + +/* Define to 1 if you have the `acl_set_fd_np' function. */ +/* #undef HAVE_ACL_SET_FD_NP */ + +/* Define to 1 if you have the `acl_set_file' function. */ +/* #undef HAVE_ACL_SET_FILE */ + +/* True for systems with POSIX ACL support */ +/* #undef HAVE_ACL_USER */ + +/* Define to 1 if you have the header file. */ +/* #undef HAVE_ATTR_XATTR_H */ + +/* Define to 1 if you have the header file. */ +#define HAVE_BZLIB_H 1 + +/* Define to 1 if you have the `chflags' function. */ +/* #undef HAVE_CHFLAGS */ + +/* Define to 1 if you have the `chown' function. */ +#define HAVE_CHOWN 1 + +/* Define to 1 if you have the `chroot' function. */ +#define HAVE_CHROOT 1 + +/* Define to 1 if you have the header file. */ +#define HAVE_CTYPE_H 1 + +/* Define to 1 if you have the `cygwin_conv_path' function. */ +/* #undef HAVE_CYGWIN_CONV_PATH */ + +/* Define to 1 if you have the declaration of `INT64_MAX', and to 0 if you + don't. */ +#define HAVE_DECL_INT64_MAX 0 + +/* Define to 1 if you have the declaration of `INT64_MIN', and to 0 if you + don't. */ +#define HAVE_DECL_INT64_MIN 0 + +/* Define to 1 if you have the declaration of `SIZE_MAX', and to 0 if you + don't. */ +#define HAVE_DECL_SIZE_MAX 1 + +/* Define to 1 if you have the declaration of `SSIZE_MAX', and to 0 if you + don't. */ +#define HAVE_DECL_SSIZE_MAX 1 + +/* Define to 1 if you have the declaration of `strerror_r', and to 0 if you + don't. */ +#define HAVE_DECL_STRERROR_R 0 + +/* Define to 1 if you have the declaration of `UINT32_MAX', and to 0 if you + don't. 
*/ +#define HAVE_DECL_UINT32_MAX 1 + +/* Define to 1 if you have the declaration of `UINT64_MAX', and to 0 if you + don't. */ +#define HAVE_DECL_UINT64_MAX 0 + +/* Define to 1 if you have the header file, and it defines `DIR'. + */ +#define HAVE_DIRENT_H 1 + +/* Define to 1 if you have the header file. */ +/* #undef HAVE_DLFCN_H */ + +/* Define to 1 if you don't have `vprintf' but do have `_doprnt.' */ +#define HAVE_DOPRNT 1 + +/* Define to 1 if nl_langinfo supports D_MD_ORDER */ +/* #undef HAVE_D_MD_ORDER */ + +/* A possible errno value for invalid file format errors */ +/* #undef HAVE_EFTYPE */ + +/* A possible errno value for invalid file format errors */ +#define HAVE_EILSEQ 1 + +/* Define to 1 if you have the header file. */ +#define HAVE_ERRNO_H 1 + +/* Define to 1 if you have the header file. */ +/* #undef HAVE_EXPAT_H */ + +/* Define to 1 if you have the header file. */ +/* #undef HAVE_EXT2FS_EXT2_FS_H */ + +/* Define to 1 if you have the `extattr_get_file' function. */ +/* #undef HAVE_EXTATTR_GET_FILE */ + +/* Define to 1 if you have the `extattr_list_file' function. */ +/* #undef HAVE_EXTATTR_LIST_FILE */ + +/* Define to 1 if you have the `extattr_set_fd' function. */ +/* #undef HAVE_EXTATTR_SET_FD */ + +/* Define to 1 if you have the `extattr_set_file' function. */ +/* #undef HAVE_EXTATTR_SET_FILE */ + +/* Define to 1 if you have the `fchdir' function. */ +#define HAVE_FCHDIR 1 + +/* Define to 1 if you have the `fchflags' function. */ +/* #undef HAVE_FCHFLAGS */ + +/* Define to 1 if you have the `fchmod' function. */ +#define HAVE_FCHMOD 1 + +/* Define to 1 if you have the `fchown' function. */ +#define HAVE_FCHOWN 1 + +/* Define to 1 if you have the `fcntl' function. */ +#define HAVE_FCNTL 1 + +/* Define to 1 if you have the header file. */ +#define HAVE_FCNTL_H 1 + +/* Define to 1 if you have the `fork' function. */ +#define HAVE_FORK 1 + +/* Define to 1 if fseeko (and presumably ftello) exists and is declared. 
*/ +/* #undef HAVE_FSEEKO */ + +/* Define to 1 if you have the `fsetxattr' function. */ +/* #undef HAVE_FSETXATTR */ + +/* Define to 1 if you have the `fstat' function. */ +#define HAVE_FSTAT 1 + +/* Define to 1 if you have the `ftruncate' function. */ +#define HAVE_FTRUNCATE 1 + +/* Define to 1 if you have the `futimens' function. */ +/* #undef HAVE_FUTIMENS */ + +/* Define to 1 if you have the `futimes' function. */ +/* #undef HAVE_FUTIMES */ + +/* Define to 1 if you have the `geteuid' function. */ +#define HAVE_GETEUID 1 + +/* Define to 1 if you have the `getgrgid_r' function. */ +/* #undef HAVE_GETGRGID_R */ + +/* Define to 1 if you have the `getgrnam_r' function. */ +/* #undef HAVE_GETGRNAM_R */ + +/* Define to 1 if you have the `getpid' function. */ +#define HAVE_GETPID 1 + +/* Define to 1 if you have the `getpwnam_r' function. */ +/* #undef HAVE_GETPWNAM_R */ + +/* Define to 1 if you have the `getpwuid_r' function. */ +/* #undef HAVE_GETPWUID_R */ + +/* Define to 1 if you have the `getxattr' function. */ +/* #undef HAVE_GETXATTR */ + +/* Define to 1 if you have the header file. */ +#define HAVE_GRP_H 1 + +/* Define to 1 if the system has the type `intmax_t'. */ +#define HAVE_INTMAX_T 1 + +/* Define to 1 if you have the header file. */ +#define HAVE_INTTYPES_H 1 + +/* Define to 1 if you have the header file. */ +/* #undef HAVE_IO_H */ + +/* Define to 1 if you have the header file. */ +#define HAVE_LANGINFO_H 1 + +/* Define to 1 if you have the `lchflags' function. */ +/* #undef HAVE_LCHFLAGS */ + +/* Define to 1 if you have the `lchmod' function. */ +/* #undef HAVE_LCHMOD */ + +/* Define to 1 if you have the `lchown' function. */ +/* #undef HAVE_LCHOWN */ + +/* Define to 1 if you have the `lgetxattr' function. */ +/* #undef HAVE_LGETXATTR */ + +/* Define to 1 if you have the `acl' library (-lacl). */ +/* #undef HAVE_LIBACL */ + +/* Define to 1 if you have the `attr' library (-lattr). 
*/ +/* #undef HAVE_LIBATTR */ + +/* Define to 1 if you have the `bz2' library (-lbz2). */ +#define HAVE_LIBBZ2 1 + +/* Define to 1 if you have the `expat' library (-lexpat). */ +/* #undef HAVE_LIBEXPAT */ + +/* Define to 1 if you have the `lzma' library (-llzma). */ +/* #undef HAVE_LIBLZMA */ + +/* Define to 1 if you have the `lzmadec' library (-llzmadec). */ +/* #undef HAVE_LIBLZMADEC */ + +/* Define to 1 if you have the `xml2' library (-lxml2). */ +/* #undef HAVE_LIBXML2 */ + +/* Define to 1 if you have the header file. */ +/* #undef HAVE_LIBXML_XMLREADER_H */ + +/* Define to 1 if you have the `z' library (-lz). */ +#define HAVE_LIBZ 1 + +/* Define to 1 if you have the header file. */ +#define HAVE_LIMITS_H 1 + +/* Define to 1 if you have the `link' function. */ +#define HAVE_LINK 1 + +/* Define to 1 if you have the header file. */ +/* #undef HAVE_LINUX_FS_H */ + +/* Define to 1 if you have the `listxattr' function. */ +/* #undef HAVE_LISTXATTR */ + +/* Define to 1 if you have the `llistxattr' function. */ +/* #undef HAVE_LLISTXATTR */ + +/* Define to 1 if you have the header file. */ +#define HAVE_LOCALE_H 1 + +/* Define to 1 if the system has the type `long long int'. */ +/* #undef HAVE_LONG_LONG_INT */ + +/* Define to 1 if you have the `lsetxattr' function. */ +/* #undef HAVE_LSETXATTR */ + +/* Define to 1 if you have the `lstat' function. */ +#define HAVE_LSTAT 1 + +/* Define to 1 if `lstat' has the bug that it succeeds when given the + zero-length file name argument. */ +/* #undef HAVE_LSTAT_EMPTY_STRING_BUG */ + +/* Define to 1 if you have the `lutimes' function. */ +/* #undef HAVE_LUTIMES */ + +/* Define to 1 if you have the header file. */ +/* #undef HAVE_LZMADEC_H */ + +/* Define to 1 if you have the header file. */ +/* #undef HAVE_LZMA_H */ + +/* Define to 1 if you have the `MD5Init' function. */ +/* #undef HAVE_MD5INIT */ + +/* Define to 1 if you have the header file. */ +/* #undef HAVE_MD5_H */ + +/* Define to 1 if you have the `memmove' function. 
*/ +#define HAVE_MEMMOVE 1 + +/* Define to 1 if you have the header file. */ +/* #undef HAVE_MEMORY_H */ + +/* Define to 1 if you have the `memset' function. */ +#define HAVE_MEMSET 1 + +/* Define to 1 if you have the `mkdir' function. */ +#define HAVE_MKDIR 1 + +/* Define to 1 if you have the `mkfifo' function. */ +#define HAVE_MKFIFO 1 + +/* Define to 1 if you have the `mknod' function. */ +#define HAVE_MKNOD 1 + +/* Define to 1 if you have the header file, and it defines `DIR'. */ +/* #undef HAVE_NDIR_H */ + +/* Define to 1 if you have the `nl_langinfo' function. */ +/* #undef HAVE_NL_LANGINFO */ + +/* Define to 1 if you have the header file. */ +/* #undef HAVE_OPENSSL_MD5_H */ + +/* Define to 1 if you have the header file. */ +/* #undef HAVE_OPENSSL_RIPEMD_H */ + +/* Define to 1 if your openssl has the `SHA256_Init' function. */ +/* #undef HAVE_OPENSSL_SHA256_INIT */ + +/* Define to 1 if your openssl has the `SHA384_Init' function. */ +/* #undef HAVE_OPENSSL_SHA384_INIT */ + +/* Define to 1 if your openssl has the `SHA512_Init' function. */ +/* #undef HAVE_OPENSSL_SHA512_INIT */ + +/* Define to 1 if you have the header file. */ +/* #undef HAVE_OPENSSL_SHA_H */ + +/* Define to 1 if you have the header file. */ +/* #undef HAVE_PATHS_H */ + +/* Define to 1 if you have the `pipe' function. */ +#define HAVE_PIPE 1 + +/* Define to 1 if you have the `poll' function. */ +/* #undef HAVE_POLL */ + +/* Define to 1 if you have the header file. */ +/* #undef HAVE_POLL_H */ + +/* Define to 1 if you have the header file. */ +#define HAVE_PWD_H 1 + +/* Define to 1 if you have the `readlink' function. */ +#define HAVE_READLINK 1 + +/* Define to 1 if you have the header file. */ +#define HAVE_REGEX_H 1 + +/* Define to 1 if you have the header file. */ +/* #undef HAVE_RIPEMD_H */ + +/* Define to 1 if you have the `RMD160Init' function. */ +/* #undef HAVE_RMD160INIT */ + +/* Define to 1 if you have the header file. 
*/ +/* #undef HAVE_RMD160_H */ + +/* Define to 1 if you have the `select' function. */ +#define HAVE_SELECT 1 + +/* Define to 1 if you have the `setenv' function. */ +#define HAVE_SETENV 1 + +/* Define to 1 if you have the `setlocale' function. */ +#define HAVE_SETLOCALE 1 + +/* Define to 1 if you have the `SHA1Init' function. */ +/* #undef HAVE_SHA1INIT */ + +/* Define to 1 if you have the header file. */ +/* #undef HAVE_SHA1_H */ + +/* Define to 1 if you have the `SHA256Init' function. */ +/* #undef HAVE_SHA256INIT */ + +/* Define to 1 if you have the header file. */ +/* #undef HAVE_SHA256_H */ + +/* Define to 1 if you have the `SHA256_Init' function. */ +/* #undef HAVE_SHA256_INIT */ + +/* Define to 1 if you have the header file. */ +/* #undef HAVE_SHA2_H */ + +/* Define to 1 if you have the `SHA384Init' function. */ +/* #undef HAVE_SHA384INIT */ + +/* Define to 1 if you have the `SHA384_Init' function. */ +/* #undef HAVE_SHA384_INIT */ + +/* Define to 1 if you have the `SHA512Init' function. */ +/* #undef HAVE_SHA512INIT */ + +/* Define to 1 if you have the `SHA512_Init' function. */ +/* #undef HAVE_SHA512_INIT */ + +/* Define to 1 if you have the header file. */ +/* #undef HAVE_SHA_H */ + +/* Define to 1 if you have the `sigaction' function. */ +#define HAVE_SIGACTION 1 + +/* Define to 1 if you have the header file. */ +#define HAVE_SIGNAL_H 1 + +/* Define to 1 if `stat' has the bug that it succeeds when given the + zero-length file name argument. */ +/* #undef HAVE_STAT_EMPTY_STRING_BUG */ + +/* Define to 1 if you have the header file. */ +#define HAVE_STDARG_H 1 + +/* Define to 1 if you have the header file. */ +#define HAVE_STDINT_H 1 + +/* Define to 1 if you have the header file. */ +#define HAVE_STDLIB_H 1 + +/* Define to 1 if you have the `strchr' function. */ +#define HAVE_STRCHR 1 + +/* Define to 1 if you have the `strdup' function. */ +#define HAVE_STRDUP 1 + +/* Define to 1 if you have the `strerror' function. 
*/ +#define HAVE_STRERROR 1 + +/* Define to 1 if you have the `strerror_r' function. */ +/* #undef HAVE_STRERROR_R */ + +/* Define to 1 if you have the `strftime' function. */ +#define HAVE_STRFTIME 1 + +/* Define to 1 if you have the header file. */ +#define HAVE_STRINGS_H 1 + +/* Define to 1 if you have the header file. */ +#define HAVE_STRING_H 1 + +/* Define to 1 if you have the `strncpy_s' function. */ +/* #undef HAVE_STRNCPY_S */ + +/* Define to 1 if you have the `strrchr' function. */ +#define HAVE_STRRCHR 1 + +/* Define to 1 if `st_birthtime' is a member of `struct stat'. */ +/* #undef HAVE_STRUCT_STAT_ST_BIRTHTIME */ + +/* Define to 1 if `st_birthtimespec.tv_nsec' is a member of `struct stat'. */ +/* #undef HAVE_STRUCT_STAT_ST_BIRTHTIMESPEC_TV_NSEC */ + +/* Define to 1 if `st_blksize' is a member of `struct stat'. */ +/* #undef HAVE_STRUCT_STAT_ST_BLKSIZE */ + +/* Define to 1 if `st_flags' is a member of `struct stat'. */ +/* #undef HAVE_STRUCT_STAT_ST_FLAGS */ + +/* Define to 1 if `st_mtimespec.tv_nsec' is a member of `struct stat'. */ +/* #undef HAVE_STRUCT_STAT_ST_MTIMESPEC_TV_NSEC */ + +/* Define to 1 if `st_mtime_n' is a member of `struct stat'. */ +/* #undef HAVE_STRUCT_STAT_ST_MTIME_N */ + +/* Define to 1 if `st_mtime_usec' is a member of `struct stat'. */ +/* #undef HAVE_STRUCT_STAT_ST_MTIME_USEC */ + +/* Define to 1 if `st_mtim.tv_nsec' is a member of `struct stat'. */ +/* #undef HAVE_STRUCT_STAT_ST_MTIM_TV_NSEC */ + +/* Define to 1 if `st_umtime' is a member of `struct stat'. */ +/* #undef HAVE_STRUCT_STAT_ST_UMTIME */ + +/* Define to 1 if you have the `symlink' function. */ +#define HAVE_SYMLINK 1 + +/* Define to 1 if you have the header file. */ +/* #undef HAVE_SYS_ACL_H */ + +/* Define to 1 if you have the header file. */ +#define HAVE_SYS_CDEFS_H 1 + +/* Define to 1 if you have the header file, and it defines `DIR'. + */ +/* #undef HAVE_SYS_DIR_H */ + +/* Define to 1 if you have the header file. 
*/ +/* #undef HAVE_SYS_EXTATTR_H */ + +/* Define to 1 if you have the header file. */ +#define HAVE_SYS_IOCTL_H 1 + +/* Define to 1 if you have the header file. */ +/* #undef HAVE_SYS_MKDEV_H */ + +/* Define to 1 if you have the header file, and it defines `DIR'. + */ +/* #undef HAVE_SYS_NDIR_H */ + +/* Define to 1 if you have the header file. */ +#define HAVE_SYS_PARAM_H 1 + +/* Define to 1 if you have the header file. */ +/* #undef HAVE_SYS_POLL_H */ + +/* Define to 1 if you have the header file. */ +#define HAVE_SYS_SELECT_H 1 + +/* Define to 1 if you have the header file. */ +#define HAVE_SYS_STAT_H 1 + +/* Define to 1 if you have the header file. */ +#define HAVE_SYS_TIME_H 1 + +/* Define to 1 if you have the header file. */ +#define HAVE_SYS_TYPES_H 1 + +/* Define to 1 if you have the header file. */ +/* #undef HAVE_SYS_UTIME_H */ + +/* Define to 1 if you have that is POSIX.1 compatible. */ +#define HAVE_SYS_WAIT_H 1 + +/* Define to 1 if you have the header file. */ +/* #undef HAVE_SYS_XATTR_H */ + +/* Define to 1 if you have the `timegm' function. */ +#define HAVE_TIMEGM 1 + +/* Define to 1 if you have the header file. */ +#define HAVE_TIME_H 1 + +/* Define to 1 if you have the `tzset' function. */ +#define HAVE_TZSET 1 + +/* Define to 1 if the system has the type `uintmax_t'. */ +#define HAVE_UINTMAX_T 1 + +/* Define to 1 if you have the header file. */ +#define HAVE_UNISTD_H 1 + +/* Define to 1 if you have the `unsetenv' function. */ +#define HAVE_UNSETENV 1 + +/* Define to 1 if the system has the type `unsigned long long'. */ +/* #undef HAVE_UNSIGNED_LONG_LONG */ + +/* Define to 1 if the system has the type `unsigned long long int'. */ +/* #undef HAVE_UNSIGNED_LONG_LONG_INT */ + +/* Define to 1 if you have the `utime' function. */ +#define HAVE_UTIME 1 + +/* Define to 1 if you have the `utimensat' function. */ +/* #undef HAVE_UTIMENSAT */ + +/* Define to 1 if you have the `utimes' function. 
*/ +/* #undef HAVE_UTIMES */ + +/* Define to 1 if you have the header file. */ +#define HAVE_UTIME_H 1 + +/* Define to 1 if you have the `vfork' function. */ +/* #undef HAVE_VFORK */ + +/* Define to 1 if you have the `vprintf' function. */ +#define HAVE_VPRINTF 1 + +/* Define to 1 if you have the header file. */ +#define HAVE_WCHAR_H 1 + +/* Define to 1 if the system has the type `wchar_t'. */ +#define HAVE_WCHAR_T 1 + +/* Define to 1 if you have the `wcrtomb' function. */ +/* #undef HAVE_WCRTOMB */ + +/* Define to 1 if you have the `wcscpy' function. */ +#define HAVE_WCSCPY 1 + +/* Define to 1 if you have the `wcslen' function. */ +#define HAVE_WCSLEN 1 + +/* Define to 1 if you have the `wctomb' function. */ +#define HAVE_WCTOMB 1 + +/* Define to 1 if you have the header file. */ +/* #undef HAVE_WCTYPE_H */ + +/* Define to 1 if you have the header file. */ +/* #undef HAVE_WINDOWS_H */ + +/* Define to 1 if you have the `wmemcmp' function. */ +#define HAVE_WMEMCMP 1 + +/* Define to 1 if you have the `wmemcpy' function. */ +#define HAVE_WMEMCPY 1 + +/* Define to 1 if you have the header file. */ +#define HAVE_ZLIB_H 1 + +/* Version number of libarchive as a single integer */ +#define LIBARCHIVE_VERSION_NUMBER "2008003" + +/* Version number of libarchive */ +#define LIBARCHIVE_VERSION_STRING "2.8.3" + +/* Define to 1 if `lstat' dereferences a symlink specified with a trailing + slash. */ +#define LSTAT_FOLLOWS_SLASHED_SYMLINK 1 + +/* Define to the sub-directory in which libtool stores uninstalled libraries. + */ +#define LT_OBJDIR ".libs/" + +/* Define to 1 if `major', `minor', and `makedev' are declared in . + */ +/* #undef MAJOR_IN_MKDEV */ + +/* Define to 1 if `major', `minor', and `makedev' are declared in + . */ +/* #undef MAJOR_IN_SYSMACROS */ + +/* Define to 1 if your C compiler doesn't accept -c and -o together. 
*/ +/* #undef NO_MINUS_C_MINUS_O */ + +/* Name of package */ +#define PACKAGE "libarchive" + +/* Define to the address where bug reports for this package should be sent. */ +#define PACKAGE_BUGREPORT "kientzle@freebsd.org" + +/* Define to the full name of this package. */ +#define PACKAGE_NAME "libarchive" + +/* Define to the full name and version of this package. */ +#define PACKAGE_STRING "libarchive 2.8.3" + +/* Define to the one symbol short name of this package. */ +#define PACKAGE_TARNAME "libarchive" + +/* Define to the home page for this package. */ +#define PACKAGE_URL "" + +/* Define to the version of this package. */ +#define PACKAGE_VERSION "2.8.3" + +/* The size of `wchar_t', as computed by sizeof. */ +#define SIZEOF_WCHAR_T 1 + +/* Define to 1 if you have the ANSI C header files. */ +#define STDC_HEADERS 1 + +/* Define to 1 if strerror_r returns char *. */ +/* #undef STRERROR_R_CHAR_P */ + +/* Define to 1 if you can safely include both and . */ +#define TIME_WITH_SYS_TIME 1 + +/* Enable extensions on AIX 3, Interix. */ +#ifndef _ALL_SOURCE +# define _ALL_SOURCE 1 +#endif +/* Enable GNU extensions on systems that have them. */ +#ifndef _GNU_SOURCE +# define _GNU_SOURCE 1 +#endif +/* Enable threading extensions on Solaris. */ +#ifndef _POSIX_PTHREAD_SEMANTICS +# define _POSIX_PTHREAD_SEMANTICS 1 +#endif +/* Enable extensions on HP NonStop. */ +#ifndef _TANDEM_SOURCE +# define _TANDEM_SOURCE 1 +#endif +/* Enable general extensions on Solaris. */ +#ifndef __EXTENSIONS__ +# define __EXTENSIONS__ 1 +#endif + + +/* Version number of package */ +#define VERSION "2.8.3" + +/* Define to '0x0500' for Windows 2000 APIs. */ +/* #undef WINVER */ + +/* Number of bits in a file offset, on hosts where this is settable. */ +/* #undef _FILE_OFFSET_BITS */ + +/* Define to 1 to make fseeko visible on some hosts (e.g. glibc 2.2). */ +/* #undef _LARGEFILE_SOURCE */ + +/* Define for large files, on AIX-style hosts. */ +/* #undef _LARGE_FILES */ + +/* Define to 1 if on MINIX. 
*/ +#define _MINIX 1 + +/* Define to 2 if the system does not provide POSIX.1 features except with + this defined. */ +#define _POSIX_1_SOURCE 2 + +/* Define to 1 if you need to in order for `stat' and other things to work. */ +#define _POSIX_SOURCE 1 + +/* Define for Solaris 2.5.1 so the uint64_t typedef from , + , or is not used. If the typedef were allowed, the + #define below would cause a syntax error. */ +/* #undef _UINT64_T */ + +/* Define to '0x0500' for Windows 2000 APIs. */ +/* #undef _WIN32_WINNT */ + +/* Define to empty if `const' does not conform to ANSI C. */ +/* #undef const */ + +/* Define to match typeof st_gid field of struct stat if doesn't + define. */ +/* #undef gid_t */ + +/* Define to `unsigned long' if does not define. */ +#define id_t unsigned long + +/* Define to the type of a signed integer type of width exactly 64 bits if + such a type exists and the standard includes do not define it. */ +/* #undef int64_t */ + +/* Define to the widest signed integer type if and do + not define. */ +/* #undef intmax_t */ + +/* Define to `int' if does not define. */ +/* #undef mode_t */ + +/* Define to `long long' if does not define. */ +/* #undef off_t */ + +/* Define to `unsigned int' if does not define. */ +/* #undef size_t */ + +/* Define to match typeof st_uid field of struct stat if doesn't + define. */ +/* #undef uid_t */ + +/* Define to the type of an unsigned integer type of width exactly 64 bits if + such a type exists and the standard includes do not define it. */ +/* #undef uint64_t */ + +/* Define to the widest unsigned integer type if and + do not define. */ +/* #undef uintmax_t */ + +/* Define to `unsigned int' if does not define. */ +/* #undef uintptr_t */ diff --git a/lib/libarchive/filter_fork.c b/lib/libarchive/filter_fork.c new file mode 100644 index 000000000..7b278c794 --- /dev/null +++ b/lib/libarchive/filter_fork.c @@ -0,0 +1,165 @@ +/*- + * Copyright (c) 2007 Joerg Sonnenberger + * All rights reserved. 
+ * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR(S) ``AS IS'' AND ANY EXPRESS OR + * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES + * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. + * IN NO EVENT SHALL THE AUTHOR(S) BE LIABLE FOR ANY DIRECT, INDIRECT, + * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT + * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF + * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + */ + +#include "archive_platform.h" + +/* This capability is only available on POSIX systems. 
*/ +#if defined(HAVE_PIPE) && defined(HAVE_FCNTL) && \ + (defined(HAVE_FORK) || defined(HAVE_VFORK)) + +__FBSDID("$FreeBSD: head/lib/libarchive/filter_fork.c 182958 2008-09-12 05:33:00Z kientzle $"); + +#if defined(HAVE_POLL) +# if defined(HAVE_POLL_H) +# include <poll.h> +# elif defined(HAVE_SYS_POLL_H) +# include <sys/poll.h> +# else +# undef HAVE_POLL +# endif +#endif + +#if defined(HAVE_SELECT) && !defined(HAVE_POLL) +# if defined(HAVE_SYS_SELECT_H) +# include <sys/select.h> +# elif defined(HAVE_UNISTD_H) +# include <unistd.h> +# endif +#endif +#ifdef HAVE_FCNTL_H +# include <fcntl.h> +#endif +#ifdef HAVE_UNISTD_H +# include <unistd.h> +#endif + +#include "filter_fork.h" + +pid_t +__archive_create_child(const char *path, int *child_stdin, int *child_stdout) +{ + pid_t child; + int stdin_pipe[2], stdout_pipe[2], tmp; + + if (pipe(stdin_pipe) == -1) + goto state_allocated; + if (stdin_pipe[0] == 1 /* stdout */) { + if ((tmp = dup(stdin_pipe[0])) == -1) + goto stdin_opened; + close(stdin_pipe[0]); + stdin_pipe[0] = tmp; + } + if (pipe(stdout_pipe) == -1) + goto stdin_opened; + if (stdout_pipe[1] == 0 /* stdin */) { + if ((tmp = dup(stdout_pipe[1])) == -1) + goto stdout_opened; + close(stdout_pipe[1]); + stdout_pipe[1] = tmp; + } + +#if HAVE_VFORK + switch ((child = vfork())) { +#else + switch ((child = fork())) { +#endif + case -1: + goto stdout_opened; + case 0: + close(stdin_pipe[1]); + close(stdout_pipe[0]); + if (dup2(stdin_pipe[0], 0 /* stdin */) == -1) + _exit(254); + if (stdin_pipe[0] != 0 /* stdin */) + close(stdin_pipe[0]); + if (dup2(stdout_pipe[1], 1 /* stdout */) == -1) + _exit(254); + if (stdout_pipe[1] != 1 /* stdout */) + close(stdout_pipe[1]); + execlp(path, path, (char *)NULL); + _exit(254); + default: + close(stdin_pipe[0]); + close(stdout_pipe[1]); + + *child_stdin = stdin_pipe[1]; + fcntl(*child_stdin, F_SETFL, O_NONBLOCK); + *child_stdout = stdout_pipe[0]; + fcntl(*child_stdout, F_SETFL, O_NONBLOCK); + } + + return child; + +stdout_opened: + close(stdout_pipe[0]); + close(stdout_pipe[1]); +stdin_opened: + 
close(stdin_pipe[0]); + close(stdin_pipe[1]); +state_allocated: + return -1; +} + +void +__archive_check_child(int in, int out) +{ +#if defined(HAVE_POLL) + struct pollfd fds[2]; + int idx; + + idx = 0; + if (in != -1) { + fds[idx].fd = in; + fds[idx].events = POLLOUT; + ++idx; + } + if (out != -1) { + fds[idx].fd = out; + fds[idx].events = POLLIN; + ++idx; + } + + poll(fds, idx, -1); /* -1 == INFTIM, wait forever */ +#elif defined(HAVE_SELECT) + fd_set fds_in, fds_out, fds_error; + + FD_ZERO(&fds_in); + FD_ZERO(&fds_out); + FD_ZERO(&fds_error); + if (out != -1) { + FD_SET(out, &fds_in); + FD_SET(out, &fds_error); + } + if (in != -1) { + FD_SET(in, &fds_out); + FD_SET(in, &fds_error); + } + select(in < out ? out + 1 : in + 1, &fds_in, &fds_out, &fds_error, NULL); +#else + sleep(1); +#endif +} + +#endif /* defined(HAVE_PIPE) && defined(HAVE_VFORK) && defined(HAVE_FCNTL) */ diff --git a/lib/libarchive/filter_fork.h b/lib/libarchive/filter_fork.h new file mode 100644 index 000000000..453d032d1 --- /dev/null +++ b/lib/libarchive/filter_fork.h @@ -0,0 +1,41 @@ +/*- + * Copyright (c) 2007 Joerg Sonnenberger + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR(S) ``AS IS'' AND ANY EXPRESS OR + * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES + * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. 
+ * IN NO EVENT SHALL THE AUTHOR(S) BE LIABLE FOR ANY DIRECT, INDIRECT, + * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT + * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF + * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + * + * $FreeBSD: head/lib/libarchive/filter_fork.h 201087 2009-12-28 02:18:26Z kientzle $ + */ + +#ifndef __LIBARCHIVE_BUILD +#error This header is only to be used internally to libarchive. +#endif + +#ifndef FILTER_FORK_H +#define FILTER_FORK_H + +pid_t +__archive_create_child(const char *path, int *child_stdin, int *child_stdout); + +void +__archive_check_child(int in, int out); + +#endif diff --git a/lib/libarchive/libarchive-formats.5 b/lib/libarchive/libarchive-formats.5 new file mode 100644 index 000000000..0acdb50c2 --- /dev/null +++ b/lib/libarchive/libarchive-formats.5 @@ -0,0 +1,355 @@ +.\" Copyright (c) 2003-2009 Tim Kientzle +.\" All rights reserved. +.\" +.\" Redistribution and use in source and binary forms, with or without +.\" modification, are permitted provided that the following conditions +.\" are met: +.\" 1. Redistributions of source code must retain the above copyright +.\" notice, this list of conditions and the following disclaimer. +.\" 2. Redistributions in binary form must reproduce the above copyright +.\" notice, this list of conditions and the following disclaimer in the +.\" documentation and/or other materials provided with the distribution. +.\" +.\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND +.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +.\" ARE DISCLAIMED. 
IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE +.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS +.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) +.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT +.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY +.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF +.\" SUCH DAMAGE. +.\" +.\" $FreeBSD: head/lib/libarchive/libarchive-formats.5 201077 2009-12-28 01:50:23Z kientzle $ +.\" +.Dd December 27, 2009 +.Dt libarchive-formats 5 +.Os +.Sh NAME +.Nm libarchive-formats +.Nd archive formats supported by the libarchive library +.Sh DESCRIPTION +The +.Xr libarchive 3 +library reads and writes a variety of streaming archive formats. +Generally speaking, all of these archive formats consist of a series of +.Dq entries . +Each entry stores a single file system object, such as a file, directory, +or symbolic link. +.Pp +The following provides a brief description of each format supported +by libarchive, with some information about recognized extensions or +limitations of the current library support. +Note that just because a format is supported by libarchive does not +imply that a program that uses libarchive will support that format. +Applications that use libarchive specify which formats they wish +to support, though many programs do use libarchive convenience +functions to enable all supported formats. +.Ss Tar Formats +The +.Xr libarchive 3 +library can read most tar archives. +However, it only writes POSIX-standard +.Dq ustar +and +.Dq pax interchange +formats. +.Pp +All tar formats store each entry in one or more 512-byte records. +The first record is used for file metadata, including filename, +timestamp, and mode information, and the file data is stored in +subsequent records. 
+Later variants have extended this by either appropriating undefined +areas of the header record, extending the header to multiple records, +or by storing special entries that modify the interpretation of +subsequent entries. +.Pp +.Bl -tag -width indent +.It Cm gnutar +The +.Xr libarchive 3 +library can read GNU-format tar archives. +It currently supports the most popular GNU extensions, including +modern long filename and linkname support, as well as atime and ctime data. +The libarchive library does not support multi-volume +archives, nor the old GNU long filename format. +It can read GNU sparse file entries, including the new POSIX-based +formats, but cannot write GNU sparse file entries. +.It Cm pax +The +.Xr libarchive 3 +library can read and write POSIX-compliant pax interchange format +archives. +Pax interchange format archives are an extension of the older ustar +format that adds a separate entry with additional attributes stored +as key/value pairs immediately before each regular entry. +The presence of these additional entries is the only difference between +pax interchange format and the older ustar format. +The extended attributes are of unlimited length and are stored +as UTF-8 Unicode strings. +Keywords defined in the standard are in all lowercase; vendors are allowed +to define custom keys by preceding them with the vendor name in all uppercase. +When writing pax archives, libarchive uses many of the SCHILY keys +defined by Joerg Schilling's +.Dq star +archiver and a few LIBARCHIVE keys. +The libarchive library can read most of the SCHILY keys +and most of the GNU keys introduced by GNU tar. +It silently ignores any keywords that it does not understand. +.It Cm restricted pax +The libarchive library can also write pax archives in which it +attempts to suppress the extended attributes entry whenever +possible. 
+The result will be identical to a ustar archive unless the +extended attributes entry is required to store a long file +name, long linkname, extended ACL, file flags, or if any of the standard +ustar data (user name, group name, UID, GID, etc) cannot be fully +represented in the ustar header. +In all cases, the result can be dearchived by any program that +can read POSIX-compliant pax interchange format archives. +Programs that correctly read ustar format (see below) will also be +able to read this format; any extended attributes will be extracted as +separate files stored in +.Pa PaxHeader +directories. +.It Cm ustar +The libarchive library can both read and write this format. +This format has the following limitations: +.Bl -bullet -compact +.It +Device major and minor numbers are limited to 21 bits. +Nodes with larger numbers will not be added to the archive. +.It +Path names in the archive are limited to 255 bytes. +(Shorter if there is no / character in exactly the right place.) +.It +Symbolic links and hard links are stored in the archive with +the name of the referenced file. +This name is limited to 100 bytes. +.It +Extended attributes, file flags, and other extended +security information cannot be stored. +.It +Archive entries are limited to 8 gigabytes in size. +.El +Note that the pax interchange format has none of these restrictions. +.El +.Pp +The libarchive library also reads a variety of commonly-used extensions to +the basic tar format. +These extensions are recognized automatically whenever they appear. +.Bl -tag -width indent +.It Numeric extensions. +The POSIX standards require fixed-length numeric fields to be written with +some character position reserved for terminators. +Libarchive allows these fields to be written without terminator characters. +This extends the allowable range; in particular, ustar archives with this +extension can support entries up to 64 gigabytes in size. 
+Libarchive also recognizes base-256 values in most numeric fields. +This essentially removes all limitations on file size, modification time, +and device numbers. +.It Solaris extensions +Libarchive recognizes ACL and extended attribute records written +by Solaris tar. +Currently, libarchive only has support for old-style ACLs; the +newer NFSv4 ACLs are recognized but discarded. +.El +.Pp +The first tar program appeared in Seventh Edition Unix in 1979. +The first official standard for the tar file format was the +.Dq ustar +(Unix Standard Tar) format defined by POSIX in 1988. +POSIX.1-2001 extended the ustar format to create the +.Dq pax interchange +format. +.Ss Cpio Formats +The libarchive library can read a number of common cpio variants and can write +.Dq odc +and +.Dq newc +format archives. +A cpio archive stores each entry as a fixed-size header followed +by a variable-length filename and variable-length data. +Unlike the tar format, the cpio format does only minimal padding +of the header or file data. +There are several cpio variants, which differ primarily in +how they store the initial header: some store the values as +octal or hexadecimal numbers in ASCII, others as binary values of +varying byte order and length. +.Bl -tag -width indent +.It Cm binary +The libarchive library transparently reads both big-endian and little-endian +variants of the original binary cpio format. +This format used 32-bit binary values for file size and mtime, +and 16-bit binary values for the other fields. +.It Cm odc +The libarchive library can both read and write this +POSIX-standard format, which is officially known as the +.Dq cpio interchange format +or the +.Dq octet-oriented cpio archive format +and sometimes unofficially referred to as the +.Dq old character format . +This format stores the header contents as octal values in ASCII. +It is standard, portable, and immune from byte-order confusion. 
+File sizes and mtime are limited to 33 bits (8GB file size), +other fields are limited to 18 bits. +.It Cm SVR4 +The libarchive library can read both CRC and non-CRC variants of +this format. +The SVR4 format uses eight-digit hexadecimal values for +all header fields. +This limits file size to 4GB, and also limits the mtime and +other fields to 32 bits. +The SVR4 format can optionally include a CRC of the file +contents, although libarchive does not currently verify this CRC. +.El +.Pp +Cpio first appeared in PWB/UNIX 1.0, which was released within +AT&T in 1977. +PWB/UNIX 1.0 formed the basis of System III Unix, released outside +of AT&T in 1981. +This makes cpio older than tar, although cpio was not included +in Version 7 AT&T Unix. +As a result, the tar command became much better known in universities +and research groups that used Version 7. +The combination of the +.Nm find +and +.Nm cpio +utilities provided very precise control over file selection. +Unfortunately, the format has many limitations that make it unsuitable +for widespread use. +Only the POSIX format permits files over 4GB, and its 18-bit +limit for most other fields makes it unsuitable for modern systems. +In addition, cpio formats only store numeric UID/GID values (not +usernames and group names), which can make it very difficult to correctly +transfer archives across systems with dissimilar user numbering. +.Ss Shar Formats +A +.Dq shell archive +is a shell script that, when executed on a POSIX-compliant +system, will recreate a collection of file system objects. +The libarchive library can write two different kinds of shar archives: +.Bl -tag -width indent +.It Cm shar +The traditional shar format uses a limited set of POSIX +commands, including +.Xr echo 1 , +.Xr mkdir 1 , +and +.Xr sed 1 . +It is suitable for portably archiving small collections of plain text files. 
+However, it is not generally well-suited for large archives +(many implementations of +.Xr sh 1 +have limits on the size of a script) nor should it be used with non-text files. +.It Cm shardump +This format is similar to shar but encodes files using +.Xr uuencode 1 +so that the result will be a plain text file regardless of the file contents. +It also includes additional shell commands that attempt to reproduce as +many file attributes as possible, including owner, mode, and flags. +The additional commands used to restore file attributes make +shardump archives less portable than plain shar archives. +.El +.Ss ISO9660 format +Libarchive can read and extract from files containing ISO9660-compliant +CDROM images. +In many cases, this can remove the need to burn a physical CDROM +just in order to read the files contained in an ISO9660 image. +It also avoids security and complexity issues that come with +virtual mounts and loopback devices. +Libarchive supports the most common Rockridge extensions and has partial +support for Joliet extensions. +If both extensions are present, the Joliet extensions will be +used and the Rockridge extensions will be ignored. +In particular, this can create problems with hardlinks and symlinks, +which are supported by Rockridge but not by Joliet. +.Ss Zip format +Libarchive can read and write zip format archives that have +uncompressed entries and entries compressed with the +.Dq deflate +algorithm. +Older zip compression algorithms are not supported. +It can extract jar archives, archives that use Zip64 extensions and many +self-extracting zip archives. +Libarchive reads Zip archives as they are being streamed, +which allows it to read archives of arbitrary size. +It currently does not use the central directory; this +limits libarchive's ability to support some self-extracting +archives and ones that have been modified in certain ways. 
+.Ss Archive (library) file format +The Unix archive format (commonly created by the +.Xr ar 1 +archiver) is a general-purpose format which is +used almost exclusively for object files to be +read by the link editor +.Xr ld 1 . +The ar format has never been standardised. +There are two common variants: +the GNU format derived from SVR4, +and the BSD format, which first appeared in 4.4BSD. +The two differ primarily in their handling of filenames +longer than 15 characters: +the GNU/SVR4 variant writes a filename table at the beginning of the archive; +the BSD format stores each long filename in an extension +area adjacent to the entry. +Libarchive can read both extensions, +including archives that may include both types of long filenames. +Programs using libarchive can write GNU/SVR4 format +if they provide a filename table to be written into +the archive before any of the entries. +Any entries whose names are not in the filename table +will be written using BSD-style long filenames. +This can cause problems for programs such as +GNU ld that do not support the BSD-style long filenames. +.Ss mtree +Libarchive can read and write files in +.Xr mtree 5 +format. +This format is not a true archive format, but rather a textual description +of a file hierarchy in which each line specifies the name of a file and +provides specific metadata about that file. +Libarchive can read all of the keywords supported by both +the NetBSD and FreeBSD versions of +.Xr mtree 1 , +although many of the keywords cannot currently be stored in an +.Tn archive_entry +object. +When writing, libarchive supports use of the +.Xr archive_write_set_options 3 +interface to specify which keywords should be included in the +output. +If libarchive was compiled with access to suitable +cryptographic libraries (such as the OpenSSL libraries), +it can compute hash entries such as +.Cm sha512 +or +.Cm md5 +from file data being written to the mtree writer. 
+.Pp +When reading an mtree file, libarchive will locate the corresponding +files on disk using the +.Cm contents +keyword if present or the regular filename. +If it can locate and open the file on disk, it will use that +to fill in any metadata that is missing from the mtree file +and will read the file contents and return those to the program +using libarchive. +If it cannot locate and open the file on disk, libarchive +will return an error for any attempt to read the entry +body. +.Sh SEE ALSO +.Xr ar 1 , +.Xr cpio 1 , +.Xr mkisofs 1 , +.Xr shar 1 , +.Xr tar 1 , +.Xr zip 1 , +.Xr zlib 3 , +.Xr cpio 5 , +.Xr mtree 5 , +.Xr tar 5 diff --git a/lib/libarchive/libarchive.3 b/lib/libarchive/libarchive.3 new file mode 100644 index 000000000..8c19d008a --- /dev/null +++ b/lib/libarchive/libarchive.3 @@ -0,0 +1,331 @@ +.\" Copyright (c) 2003-2007 Tim Kientzle +.\" All rights reserved. +.\" +.\" Redistribution and use in source and binary forms, with or without +.\" modification, are permitted provided that the following conditions +.\" are met: +.\" 1. Redistributions of source code must retain the above copyright +.\" notice, this list of conditions and the following disclaimer. +.\" 2. Redistributions in binary form must reproduce the above copyright +.\" notice, this list of conditions and the following disclaimer in the +.\" documentation and/or other materials provided with the distribution. +.\" +.\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND +.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +.\" ARE DISCLAIMED. 
IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE +.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS +.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) +.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT +.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY +.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF +.\" SUCH DAMAGE. +.\" +.\" $FreeBSD: src/lib/libarchive/libarchive.3,v 1.11 2007/01/09 08:05:56 kientzle Exp $ +.\" +.Dd August 19, 2006 +.Dt LIBARCHIVE 3 +.Os +.Sh NAME +.Nm libarchive +.Nd functions for reading and writing streaming archives +.Sh LIBRARY +.Lb libarchive +.Sh OVERVIEW +The +.Nm +library provides a flexible interface for reading and writing +streaming archive files such as tar and cpio. +The library is inherently stream-oriented; readers serially iterate through +the archive, writers serially add things to the archive. +In particular, note that there is no built-in support for +random access nor for in-place modification. +.Pp +When reading an archive, the library automatically detects the +format and the compression. +The library currently has read support for: +.Bl -bullet -compact +.It +old-style tar archives, +.It +most variants of the POSIX +.Dq ustar +format, +.It +the POSIX +.Dq pax interchange +format, +.It +GNU-format tar archives, +.It +most common cpio archive formats, +.It +ISO9660 CD images (with or without RockRidge extensions), +.It +Zip archives. +.El +The library automatically detects archives compressed with +.Xr gzip 1 , +.Xr bzip2 1 , +or +.Xr compress 1 +and decompresses them transparently. +.Pp +When writing an archive, you can specify the compression +to be used and the format to use. 
+The library can write +.Bl -bullet -compact +.It +POSIX-standard +.Dq ustar +archives, +.It +POSIX +.Dq pax interchange format +archives, +.It +POSIX octet-oriented cpio archives, +.It +two different variants of shar archives. +.El +Pax interchange format is an extension of the tar archive format that +eliminates essentially all of the limitations of historic tar formats +in a standard fashion that is supported +by POSIX-compliant +.Xr pax 1 +implementations on many systems as well as several newer implementations of +.Xr tar 1 . +Note that the default write format will suppress the pax extended +attributes for most entries; explicitly requesting pax format will +enable those attributes for all entries. +.Pp +The read and write APIs are accessed through the +.Fn archive_read_XXX +functions and the +.Fn archive_write_XXX +functions, respectively, and either can be used independently +of the other. +.Pp +The rest of this manual page provides an overview of the library +operation. +More detailed information can be found in the individual manual +pages for each API or utility function. +.Sh READING AN ARCHIVE +To read an archive, you must first obtain an initialized +.Tn struct archive +object from +.Fn archive_read_new . +You can then modify this object for the desired operations with the +various +.Fn archive_read_set_XXX +and +.Fn archive_read_support_XXX +functions. +In particular, you will need to invoke appropriate +.Fn archive_read_support_XXX +functions to enable the corresponding compression and format +support. +Note that these latter functions perform two distinct operations: +they cause the corresponding support code to be linked into your +program, and they enable the corresponding auto-detect code. +Unless you have specific constraints, you will generally want +to invoke +.Fn archive_read_support_compression_all +and +.Fn archive_read_support_format_all +to enable auto-detect for all formats and compression types +currently supported by the library. 
+.Pp +Once you have prepared the +.Tn struct archive +object, you call +.Fn archive_read_open +to actually open the archive and prepare it for reading. +There are several variants of this function; +the most basic expects you to provide pointers to several +functions that can provide blocks of bytes from the archive. +There are convenience forms that allow you to +specify a filename, file descriptor, +.Ft "FILE *" +object, or a block of memory from which to read the archive data. +Note that the core library makes no assumptions about the +size of the blocks read; +callback functions are free to read whatever block size is +most appropriate for the medium. +.Pp +Each archive entry consists of a header followed by a certain +amount of data. +You can obtain the next header with +.Fn archive_read_next_header , +which returns a pointer to a +.Tn struct archive_entry +structure with information about the current archive element. +If the entry is a regular file, then the header will be followed +by the file data. +You can use +.Fn archive_read_data +(which works much like the +.Xr read 2 +system call) +to read this data from the archive. +You may prefer to use the higher-level +.Fn archive_read_data_skip , +which reads and discards the data for this entry, +.Fn archive_read_data_into_buffer , +which reads the data into an in-memory buffer, +.Fn archive_read_data_into_fd , +which copies the data to the provided file descriptor, or +.Fn archive_read_extract , +which recreates the specified entry on disk and copies data +from the archive. +In particular, note that +.Fn archive_read_extract +uses the +.Tn struct archive_entry +structure that you provide it, which may differ from the +entry just read from the archive. +In particular, many applications will want to override the +pathname, file permissions, or ownership.
+.Pp +Once you have finished reading data from the archive, you +should call +.Fn archive_read_close +to close the archive, then call +.Fn archive_read_finish +to release all resources, including all memory allocated by the library. +.Pp +The +.Xr archive_read 3 +manual page provides more detailed calling information for this API. +.Sh WRITING AN ARCHIVE +You use a similar process to write an archive. +The +.Fn archive_write_new +function creates an archive object useful for writing, +the various +.Fn archive_write_set_XXX +functions are used to set parameters for writing the archive, and +.Fn archive_write_open +completes the setup and opens the archive for writing. +.Pp +Individual archive entries are written in a three-step +process: +You first initialize a +.Tn struct archive_entry +structure with information about the new entry. +At a minimum, you should set the pathname of the +entry and provide a +.Va struct stat +with a valid +.Va st_mode +field, which specifies the type of object and +.Va st_size +field, which specifies the size of the data portion of the object. +The +.Fn archive_write_header +function actually writes the header data to the archive. +You can then use +.Fn archive_write_data +to write the actual data. +.Pp +After all entries have been written, use the +.Fn archive_write_finish +function to release all resources. +.Pp +The +.Xr archive_write 3 +manual page provides more detailed calling information for this API. +.Sh DESCRIPTION +Detailed descriptions of each function are provided by the +corresponding manual pages. +.Pp +All of the functions utilize an opaque +.Tn struct archive +datatype that provides access to the archive contents. +.Pp +The +.Tn struct archive_entry +structure contains a complete description of a single archive +entry. +It uses an opaque interface that is fully documented in +.Xr archive_entry 3 . 
+.Pp +Users familiar with historic formats should be aware that the newer +variants have eliminated most restrictions on the length of textual fields. +Clients should not assume that filenames, link names, user names, or +group names are limited in length. +In particular, pax interchange format can easily accommodate pathnames +in arbitrary character sets that exceed +.Va PATH_MAX . +.Sh RETURN VALUES +Most functions return zero on success, non-zero on error. +The return value indicates the general severity of the error, ranging +from +.Cm ARCHIVE_WARN , +which indicates a minor problem that should probably be reported +to the user, to +.Cm ARCHIVE_FATAL , +which indicates a serious problem that will prevent any further +operations on this archive. +On error, the +.Fn archive_errno +function can be used to retrieve a numeric error code (see +.Xr errno 2 ) . +The +.Fn archive_error_string +returns a textual error message suitable for display. +.Pp +.Fn archive_read_new +and +.Fn archive_write_new +return pointers to an allocated and initialized +.Tn struct archive +object. +.Pp +.Fn archive_read_data +and +.Fn archive_write_data +return a count of the number of bytes actually read or written. +A value of zero indicates the end of the data for this entry. +A negative value indicates an error, in which case the +.Fn archive_errno +and +.Fn archive_error_string +functions can be used to obtain more information. +.Sh ENVIRONMENT +There are character set conversions within the +.Xr archive_entry 3 +functions that are impacted by the currently-selected locale. +.Sh SEE ALSO +.Xr tar 1 , +.Xr archive_entry 3 , +.Xr archive_read 3 , +.Xr archive_util 3 , +.Xr archive_write 3 , +.Xr tar 5 +.Sh HISTORY +The +.Nm libarchive +library first appeared in +.Fx 5.3 . +.Sh AUTHORS +.An -nosplit +The +.Nm libarchive +library was written by +.An Tim Kientzle Aq kientzle@acm.org . +.Sh BUGS +Some archive formats support information that is not supported by +.Tn struct archive_entry . 
+Such information cannot be fully archived or restored using this library. +This includes, for example, comments, character sets, +or the arbitrary key/value pairs that can appear in +pax interchange format archives. +.Pp +Conversely, of course, not all of the information that can be +stored in an +.Tn struct archive_entry +is supported by all formats. +For example, cpio formats do not support nanosecond timestamps; +old tar formats do not support large device numbers. diff --git a/lib/libarchive/libarchive_internals.3 b/lib/libarchive/libarchive_internals.3 new file mode 100644 index 000000000..9a42b76d4 --- /dev/null +++ b/lib/libarchive/libarchive_internals.3 @@ -0,0 +1,366 @@ +.\" Copyright (c) 2003-2007 Tim Kientzle +.\" All rights reserved. +.\" +.\" Redistribution and use in source and binary forms, with or without +.\" modification, are permitted provided that the following conditions +.\" are met: +.\" 1. Redistributions of source code must retain the above copyright +.\" notice, this list of conditions and the following disclaimer. +.\" 2. Redistributions in binary form must reproduce the above copyright +.\" notice, this list of conditions and the following disclaimer in the +.\" documentation and/or other materials provided with the distribution. +.\" +.\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND +.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +.\" ARE DISCLAIMED. 
IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE +.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS +.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) +.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT +.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY +.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF +.\" SUCH DAMAGE. +.\" +.\" $FreeBSD: src/lib/libarchive/libarchive_internals.3,v 1.2 2007/12/30 04:58:22 kientzle Exp $ +.\" +.Dd April 16, 2007 +.Dt LIBARCHIVE_INTERNALS 3 +.Os +.Sh NAME +.Nm libarchive_internals +.Nd description of libarchive internal interfaces +.Sh OVERVIEW +The +.Nm libarchive +library provides a flexible interface for reading and writing +streaming archive files such as tar and cpio. +Internally, it follows a modular layered design that should +make it easy to add new archive and compression formats. +.Sh GENERAL ARCHITECTURE +Externally, libarchive exposes most operations through an +opaque, object-style interface. +The +.Xr archive_entry 3 +objects store information about a single filesystem object. +The rest of the library provides facilities to write +.Xr archive_entry 3 +objects to archive files, +read them from archive files, +and write them to disk. +(There are plans to add a facility to read +.Xr archive_entry 3 +objects from disk as well.) +.Pp +The read and write APIs each have four layers: a public API +layer, a format layer that understands the archive file format, +a compression layer, and an I/O layer. +The I/O layer is completely exposed to clients who can replace +it entirely with their own functions. +.Pp +In order to provide as much consistency as possible for clients, +some public functions are virtualized.
+Eventually, it should be possible for clients to open +an archive or disk writer, and then use a single set of +code to select and write entries, regardless of the target. +.Sh READ ARCHITECTURE +From the outside, clients use the +.Xr archive_read 3 +API to manipulate an +.Nm archive +object to read entries and bodies from an archive stream. +Internally, the +.Nm archive +object is cast to an +.Nm archive_read +object, which holds all read-specific data. +The API has four layers: +The lowest layer is the I/O layer. +This layer can be overridden by clients, but most clients use +the packaged I/O callbacks provided, for example, by +.Xr archive_read_open_memory 3 , +and +.Xr archive_read_open_fd 3 . +The compression layer calls the I/O layer to +read bytes and decompresses them for the format layer. +The format layer unpacks a stream of uncompressed bytes and +creates +.Nm archive_entry +objects from the incoming data. +The API layer tracks overall state +(for example, it prevents clients from reading data before reading a header) +and invokes the format and compression layer operations +through registered function pointers. +In particular, the API layer drives the format-detection process: +When opening the archive, it reads an initial block of data +and offers it to each registered compression handler. +The one with the highest bid is initialized with the first block. +Similarly, the format handlers are polled to see which handler +is the best for each archive. +(Prior to 2.4.0, the format bidders were invoked for each +entry, but this design hindered error recovery.) +.Ss I/O Layer and Client Callbacks +The read API goes to some lengths to be nice to clients. +As a result, there are few restrictions on the behavior of +the client callbacks. +.Pp +The client read callback is expected to provide a block +of data on each call. +A zero-length return does indicate end of file, but otherwise +blocks may be as small as one byte or as large as the entire file. 
+In particular, blocks may be of different sizes. +.Pp +The client skip callback returns the number of bytes actually +skipped, which may be much smaller than the skip requested. +The only requirement is that the skip not be larger. +In particular, clients are allowed to return zero for any +skip that they don't want to handle. +The skip callback must never be invoked with a negative value. +.Pp +Keep in mind that not all clients are reading from disk: +clients reading from networks may provide different-sized +blocks on every request and cannot skip at all; +advanced clients may use +.Xr mmap 2 +to read the entire file into memory at once and return the +entire file to libarchive as a single block; +other clients may begin asynchronous I/O operations for the +next block on each request. +.Ss Decompression Layer +The decompression layer not only handles decompression, +it also buffers data so that the format handlers see a +much nicer I/O model. +The decompression API is a two-stage peek/consume model. +A read_ahead request specifies a minimum read amount; +the decompression layer must provide a pointer to at least +that much data. +If more data is immediately available, it should return more: +the format layer handles bulk data reads by asking for a minimum +of one byte and then copying as much data as is available. +.Pp +A subsequent call to the +.Fn consume +function advances the read pointer. +Note that data returned from a +.Fn read_ahead +call is guaranteed to remain in place until +the next call to +.Fn read_ahead . +Intervening calls to +.Fn consume +should not cause the data to move. +.Pp +Skip requests must always be handled exactly. +Decompression handlers that cannot seek forward should +not register a skip handler; +the API layer fills in a generic skip handler that reads and discards data.
+.Pp +A decompression handler has a specific lifecycle: +.Bl -tag -compact -width indent +.It Registration/Configuration +When the client invokes the public support function, +the decompression handler invokes the internal +.Fn __archive_read_register_compression +function to provide bid and initialization functions. +This function returns +.Cm NULL +on error or else a pointer to a +.Cm struct decompressor_t . +This structure contains a +.Va void * config +slot that can be used for storing any customization information. +.It Bid +The bid function is invoked with a pointer and size of a block of data. +The decompressor can access its config data +through the +.Va decompressor +element of the +.Cm archive_read +object. +The bid function is otherwise stateless. +In particular, it must not perform any I/O operations. +.Pp +The value returned by the bid function indicates its suitability +for handling this data stream. +A bid of zero will ensure that this decompressor is never invoked. +Return zero if magic number checks fail. +Otherwise, your initial implementation should return the number of bits +actually checked. +For example, if you verify two full bytes and three bits of another +byte, bid 19. +Note that the initial block may be very short; +be careful to only inspect the data you are given. +(The current decompressors require two bytes for correct bidding.) +.It Initialize +The winning bidder will have its init function called. +This function should initialize the remaining slots of the +.Va struct decompressor_t +object pointed to by the +.Va decompressor +element of the +.Va archive_read +object. +In particular, it should allocate any working data it needs +in the +.Va data +slot of that structure. +The init function is called with the block of data that +was used for tasting. +At this point, the decompressor is responsible for all I/O +requests to the client callbacks. +The decompressor is free to read more data as and when +necessary. 
+.It Satisfy I/O requests +The format handler will invoke the +.Va read_ahead , +.Va consume , +and +.Va skip +functions as needed. +.It Finish +The finish method is called only once when the archive is closed. +It should release anything stored in the +.Va data +and +.Va config +slots of the +.Va decompressor +object. +It should not invoke the client close callback. +.El +.Ss Format Layer +The read formats have a similar lifecycle to the decompression handlers: +.Bl -tag -compact -width indent +.It Registration +Allocate your private data and initialize your pointers. +.It Bid +Formats bid by invoking the +.Fn read_ahead +decompression method but not calling the +.Fn consume +method. +This allows each bidder to look ahead in the input stream. +Bidders should not look further ahead than necessary, as long +look aheads put pressure on the decompression layer to buffer +lots of data. +Most formats only require a few hundred bytes of look ahead; +look aheads of a few kilobytes are reasonable. +(The ISO9660 reader sometimes looks ahead by 48k, which +should be considered an upper limit.) +.It Read header +The header read is usually the most complex part of any format. +There are a few strategies worth mentioning: +For formats such as tar or cpio, reading and parsing the header is +straightforward since headers alternate with data. +For formats that store all header data at the beginning of the file, +the first header read request may have to read all headers into +memory and store that data, sorted by the location of the file +data. +Subsequent header read requests will skip forward to the +beginning of the file data and return the corresponding header. +.It Read Data +The read data interface supports sparse files; this requires that +each call return a block of data specifying the file offset and +size. +This may require you to carefully track the location so that you +can return accurate file offsets for each read. 
+Remember that the decompressor will return as much data as it has. +Generally, you will want to request one byte, +examine the return value to see how much data is available, and +possibly trim that to the amount you can use. +You should invoke consume for each block just before you return it. +.It Skip All Data +The skip data call should skip over all file data and trailing padding. +This is called automatically by the API layer just before each +header read. +It is also called in response to the client calling the public +.Fn data_skip +function. +.It Cleanup +On cleanup, the format should release all of its allocated memory. +.El +.Ss API Layer +XXX to do XXX +.Sh WRITE ARCHITECTURE +The write API has a similar set of four layers: +an API layer, a format layer, a compression layer, and an I/O layer. +The registration here is much simpler because only +one format and one compression can be registered at a time. +.Ss I/O Layer and Client Callbacks +XXX To be written XXX +.Ss Compression Layer +XXX To be written XXX +.Ss Format Layer +XXX To be written XXX +.Ss API Layer +XXX To be written XXX +.Sh WRITE_DISK ARCHITECTURE +The write_disk API is intended to look just like the write API +to clients. +Since it does not handle multiple formats or compression, it +is not layered internally. +.Sh GENERAL SERVICES +The +.Nm archive_read , +.Nm archive_write , +and +.Nm archive_write_disk +objects all contain an initial +.Nm archive +object which provides common support for a set of standard services. +(Recall that ANSI/ISO C90 guarantees that you can cast freely between +a pointer to a structure and a pointer to the first element of that +structure.) +The +.Nm archive +object has a magic value that indicates which API this object +is associated with, +slots for storing error information, +and function pointers for virtualized API functions. +.Sh MISCELLANEOUS NOTES +Connecting existing archiving libraries into libarchive is generally +quite difficult. 
+In particular, many existing libraries strongly assume that you +are reading from a file; they seek forwards and backwards as necessary +to locate various pieces of information. +In contrast, libarchive never seeks backwards in its input, which +sometimes requires very different approaches. +.Pp +For example, libarchive's ISO9660 support operates very differently +from most ISO9660 readers. +The libarchive support utilizes a work-queue design that +keeps a list of known entries sorted by their location in the input. +Whenever libarchive's ISO9660 implementation is asked for the next +header, it checks this list to find the next item on the disk. +Directories are parsed when they are encountered and new +items are added to the list. +This design relies heavily on the ISO9660 image being optimized so that +directories always occur earlier on the disk than the files they +describe. +.Pp +Depending on the specific format, such approaches may not be possible. +The ZIP format specification, for example, allows archivers to store +key information only at the end of the file. +In theory, it is possible to create ZIP archives that cannot +be read without seeking. +Fortunately, such archives are very rare, and libarchive can read +most ZIP archives, though it cannot always extract as much information +as a dedicated ZIP program. +.Sh SEE ALSO +.Xr archive 3 , +.Xr archive_entry 3 , +.Xr archive_read 3 , +.Xr archive_write 3 , +.Xr archive_write_disk 3 +.Sh HISTORY +The +.Nm libarchive +library first appeared in +.Fx 5.3 . +.Sh AUTHORS +.An -nosplit +The +.Nm libarchive +library was written by +.An Tim Kientzle Aq kientzle@acm.org .
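The bidding convention described for decompression handlers (return zero when the magic check fails, otherwise the number of bits actually verified, and never read past the block you were given) can be sketched as follows. This is a minimal illustration only: `gzip_bid_sketch` and its two-argument signature are hypothetical and are not the internal libarchive bid interface; the only outside fact used is the two-byte gzip magic number 0x1f 0x8b.

```c
#include <stddef.h>

/*
 * Hypothetical bid function for a gzip-style decompressor.
 * Returns 0 if the block cannot be gzip data, otherwise the number
 * of bits that were actually checked. It performs no I/O and keeps
 * no state, as the bid step requires.
 */
static int gzip_bid_sketch(const void *buff, size_t len)
{
    const unsigned char *p = (const unsigned char *)buff;
    int bits_checked = 0;

    /* The initial block may be very short; inspect only what we have. */
    if (len < 2)
        return 0;

    if (p[0] != 0x1f)       /* first magic byte */
        return 0;
    bits_checked += 8;

    if (p[1] != 0x8b)       /* second magic byte */
        return 0;
    bits_checked += 8;

    return bits_checked;    /* two full bytes verified */
}
```

A handler that also verified, say, three reserved flag bits in a later byte would add 3 to the bid, giving the value 19 used as the example in the text.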
+.Sh BUGS diff --git a/lib/libarchive/minix_utils.c b/lib/libarchive/minix_utils.c new file mode 100644 index 000000000..86db21756 --- /dev/null +++ b/lib/libarchive/minix_utils.c @@ -0,0 +1,15 @@ +#include "minix_utils.h" + +u64_t lshift64(u64_t x, unsigned short b) +{ + u64_t r; + + if(b >= 32) { + r.lo = 0; + r.hi = (b < 64) ? x.lo << (b - 32) : 0; + }else { + r.lo = x.lo << b; + r.hi = (b ? x.lo >> (32 - b) : 0) | (x.hi << b); + } + return r; +} diff --git a/lib/libarchive/minix_utils.h b/lib/libarchive/minix_utils.h new file mode 100644 index 000000000..b198004f7 --- /dev/null +++ b/lib/libarchive/minix_utils.h @@ -0,0 +1,5 @@ +#ifndef MINIX_UTILS_H +#define MINIX_UTILS_H +#include +u64_t lshift64(u64_t x, unsigned short b); +#endif diff --git a/lib/libarchive/mtree.5 b/lib/libarchive/mtree.5 new file mode 100644 index 000000000..b6637d6f5 --- /dev/null +++ b/lib/libarchive/mtree.5 @@ -0,0 +1,269 @@ +.\" Copyright (c) 1989, 1990, 1993 +.\" The Regents of the University of California. All rights reserved. +.\" +.\" Redistribution and use in source and binary forms, with or without +.\" modification, are permitted provided that the following conditions +.\" are met: +.\" 1. Redistributions of source code must retain the above copyright +.\" notice, this list of conditions and the following disclaimer. +.\" 2. Redistributions in binary form must reproduce the above copyright +.\" notice, this list of conditions and the following disclaimer in the +.\" documentation and/or other materials provided with the distribution. +.\" 4. Neither the name of the University nor the names of its contributors +.\" may be used to endorse or promote products derived from this software +.\" without specific prior written permission. +.\" +.\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND +.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +.\" ARE DISCLAIMED.
IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE +.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS +.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) +.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT +.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY +.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF +.\" SUCH DAMAGE. +.\" +.\" From: @(#)mtree.8 8.2 (Berkeley) 12/11/93 +.\" $FreeBSD$ +.\" +.Dd August 20, 2007 +.Dt MTREE 5 +.Os +.Sh NAME +.Nm mtree +.Nd format of mtree dir hierarchy files +.Sh DESCRIPTION +The +.Nm +format is a textual format that describes a collection of filesystem objects. +Such files are typically used to create or verify directory hierarchies. +.Ss General Format +An +.Nm +file consists of a series of lines, each providing information +about a single filesystem object. +Leading whitespace is always ignored. +.Pp +When encoding file or pathnames, any backslash character or +character outside of the 95 printable ASCII characters must be +encoded as a backslash followed by three +octal digits. +When reading mtree files, any appearance of a backslash +followed by three octal digits should be converted into the +corresponding character. +.Pp +Each line is interpreted independently as one of the following types: +.Bl -tag -width Cm +.It Signature +The first line of any mtree file must begin with +.Dq #mtree . +If a file contains any full path entries, the first line should +begin with +.Dq #mtree v2.0 , +otherwise, the first line should begin with +.Dq #mtree v1.0 . +.It Blank +Blank lines are ignored. +.It Comment +Lines beginning with +.Cm # +are ignored. +.It Special +Lines beginning with +.Cm / +are special commands that influence +the interpretation of later lines.
+.It Relative +If the first whitespace-delimited word has no +.Cm / +characters, +it is the name of a file in the current directory. +Any relative entry that describes a directory changes the +current directory. +.It dot-dot +As a special case, a relative entry with the filename +.Pa .. +changes the current directory to the parent directory. +Options on dot-dot entries are always ignored. +.It Full +If the first whitespace-delimited word has a +.Cm / +character after +the first character, it is the pathname of a file relative to the +starting directory. +There can be multiple full entries describing the same file. +.El +.Pp +Some tools that process +.Nm +files may require that multiple lines describing the same file +occur consecutively. +It is not permitted for the same file to be mentioned using +both a relative and a full file specification. +.Ss Special commands +Two special commands are currently defined: +.Bl -tag -width Cm +.It Cm /set +This command defines default values for one or more keywords. +It is followed on the same line by one or more whitespace-separated +keyword definitions. +These definitions apply to all following files that do not specify +a value for that keyword. +.It Cm /unset +This command removes any default value set by a previous +.Cm /set +command. +It is followed on the same line by one or more keywords +separated by whitespace. +.El +.Ss Keywords +After the filename, a full or relative entry consists of zero +or more whitespace-separated keyword definitions. +Each such definition consists of a key from the following +list immediately followed by an '=' sign +and a value. +Software programs reading mtree files should warn about +unrecognized keywords. +.Pp +Currently supported keywords are as follows: +.Bl -tag -width Cm +.It Cm cksum +The checksum of the file using the default algorithm specified by +the +.Xr cksum 1 +utility. +.It Cm contents +The full pathname of a file that holds the contents of this file. 
+.It Cm flags +The file flags as a symbolic name. +See +.Xr chflags 1 +for information on these names. +If no flags are to be set the string +.Dq none +may be used to override the current default. +.It Cm gid +The file group as a numeric value. +.It Cm gname +The file group as a symbolic name. +.It Cm ignore +Ignore any file hierarchy below this file. +.It Cm link +The target of the symbolic link when type=link. +.It Cm md5 +The MD5 message digest of the file. +.It Cm md5digest +A synonym for +.Cm md5 . +.It Cm mode +The current file's permissions as a numeric (octal) or symbolic +value. +.It Cm nlink +The number of hard links the file is expected to have. +.It Cm nochange +Make sure this file or directory exists but otherwise ignore all attributes. +.It Cm ripemd160digest +The +.Tn RIPEMD160 +message digest of the file. +.It Cm rmd160 +A synonym for +.Cm ripemd160digest . +.It Cm rmd160digest +A synonym for +.Cm ripemd160digest . +.It Cm sha1 +The +.Tn FIPS +180-1 +.Pq Dq Tn SHA-1 +message digest of the file. +.It Cm sha1digest +A synonym for +.Cm sha1 . +.It Cm sha256 +The +.Tn FIPS +180-2 +.Pq Dq Tn SHA-256 +message digest of the file. +.It Cm sha256digest +A synonym for +.Cm sha256 . +.It Cm size +The size, in bytes, of the file. +.It Cm time +The last modification time of the file. +.It Cm type +The type of the file; may be set to any one of the following: +.Pp +.Bl -tag -width Cm -compact +.It Cm block +block special device +.It Cm char +character special device +.It Cm dir +directory +.It Cm fifo +fifo +.It Cm file +regular file +.It Cm link +symbolic link +.It Cm socket +socket +.El +.It Cm uid +The file owner as a numeric value. +.It Cm uname +The file owner as a symbolic name. +.El +.Pp +.Sh SEE ALSO +.Xr cksum 1 , +.Xr find 1 , +.Xr mtree 8 +.Sh BUGS +The +.Fx +implementation of mtree does not currently support +the +.Nm +2.0 +format. +The requirement for a +.Dq #mtree +signature line is new and not yet widely implemented.
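The pathname encoding rule from the General Format section (a backslash followed by three octal digits) can be sketched as below. This is a minimal sketch under stated assumptions: `mtree_encode_name` is a hypothetical helper, not a libarchive function, and it additionally escapes the space character. The text counts space among the 95 printable ASCII characters, but entries are split on whitespace, so escaping it is a conservative choice made here; any reader that follows the decoding rule will still recover the original name.

```c
#include <stdio.h>
#include <string.h>

/*
 * Encode a pathname for use in an mtree line.
 * Backslash, space (an assumption of this sketch), and every byte
 * outside printable ASCII become a backslash plus three octal digits.
 * Returns 0 on success, -1 if the output buffer is too small.
 * Hypothetical helper for illustration; assumes 8-bit bytes.
 */
static int mtree_encode_name(char *out, size_t outsize, const char *name)
{
    const unsigned char *p;
    size_t used = 0;

    for (p = (const unsigned char *)name; *p != '\0'; p++) {
        int plain = (*p > 0x20 && *p < 0x7f && *p != '\\');
        size_t need = plain ? 1 : 4;

        /* Keep room for this character and the terminating NUL. */
        if (used + need + 1 > outsize)
            return -1;

        if (plain) {
            out[used++] = (char)*p;
        } else {
            /* Three octal digits always cover an 8-bit value. */
            snprintf(out + used, 5, "\\%03o", (unsigned int)*p);
            used += 4;
        }
    }

    if (used + 1 > outsize)
        return -1;
    out[used] = '\0';
    return 0;
}
```

With this helper, the name `a b\c` (containing one space and one backslash) is written as `a\040b\134c`.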
+.Sh HISTORY +The +.Nm +utility appeared in +.Bx 4.3 Reno . +The +.Tn MD5 +digest capability was added in +.Fx 2.1 , +in response to the widespread use of programs which can spoof +.Xr cksum 1 . +The +.Tn SHA-1 +and +.Tn RIPEMD160 +digests were added in +.Fx 4.0 , +as new attacks have demonstrated weaknesses in +.Tn MD5 . +The +.Tn SHA-256 +digest was added in +.Fx 6.0 . +Support for file flags was added in +.Fx 4.0 , +and mostly comes from +.Nx . +The +.Dq full +entry format was added by +.Nx . diff --git a/lib/libarchive/tar.5 b/lib/libarchive/tar.5 new file mode 100644 index 000000000..aafd535a1 --- /dev/null +++ b/lib/libarchive/tar.5 @@ -0,0 +1,831 @@ +.\" Copyright (c) 2003-2009 Tim Kientzle +.\" All rights reserved. +.\" +.\" Redistribution and use in source and binary forms, with or without +.\" modification, are permitted provided that the following conditions +.\" are met: +.\" 1. Redistributions of source code must retain the above copyright +.\" notice, this list of conditions and the following disclaimer. +.\" 2. Redistributions in binary form must reproduce the above copyright +.\" notice, this list of conditions and the following disclaimer in the +.\" documentation and/or other materials provided with the distribution. +.\" +.\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND +.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +.\" ARE DISCLAIMED. 
IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE +.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS +.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) +.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT +.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY +.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF +.\" SUCH DAMAGE. +.\" +.\" $FreeBSD: head/lib/libarchive/tar.5 201077 2009-12-28 01:50:23Z kientzle $ +.\" +.Dd December 27, 2009 +.Dt tar 5 +.Os +.Sh NAME +.Nm tar +.Nd format of tape archive files +.Sh DESCRIPTION +The +.Nm +archive format collects any number of files, directories, and other +file system objects (symbolic links, device nodes, etc.) into a single +stream of bytes. +The format was originally designed to be used with +tape drives that operate with fixed-size blocks, but is widely used as +a general packaging mechanism. +.Ss General Format +A +.Nm +archive consists of a series of 512-byte records. +Each file system object requires a header record which stores basic metadata +(pathname, owner, permissions, etc.) and zero or more records containing any +file data. +The end of the archive is indicated by two records consisting +entirely of zero bytes. +.Pp +For compatibility with tape drives that use fixed block sizes, +programs that read or write tar files always read or write a fixed +number of records with each I/O operation. +These +.Dq blocks +are always a multiple of the record size. +The maximum block size supported by early +implementations was 10240 bytes or 20 records. +This is still the default for most implementations +although block sizes of 1MiB (2048 records) or larger are +commonly used with modern high-speed tape drives. 
+(Note: the terms +.Dq block +and +.Dq record +here are not entirely standard; this document follows the +convention established by John Gilmore in documenting +.Nm pdtar . ) +.Ss Old-Style Archive Format +The original tar archive format has been extended many times to +include additional information that various implementors found +necessary. +This section describes the variant implemented by the tar command +included in +.At v7 , +which seems to be the earliest widely-used version of the tar program. +.Pp +The header record for an old-style +.Nm +archive consists of the following: +.Bd -literal -offset indent +struct header_old_tar { + char name[100]; + char mode[8]; + char uid[8]; + char gid[8]; + char size[12]; + char mtime[12]; + char checksum[8]; + char linkflag[1]; + char linkname[100]; + char pad[255]; +}; +.Ed +All unused bytes in the header record are filled with nulls. +.Bl -tag -width indent +.It Va name +Pathname, stored as a null-terminated string. +Early tar implementations only stored regular files (including +hardlinks to those files). +One common early convention used a trailing "/" character to indicate +a directory name, allowing directory permissions and owner information +to be archived and restored. +.It Va mode +File mode, stored as an octal number in ASCII. +.It Va uid , Va gid +User id and group id of owner, as octal numbers in ASCII. +.It Va size +Size of file, as octal number in ASCII. +For regular files only, this indicates the amount of data +that follows the header. +In particular, this field was ignored by early tar implementations +when extracting hardlinks. +Modern writers should always store a zero length for hardlink entries. +.It Va mtime +Modification time of file, as an octal number in ASCII. +This indicates the number of seconds since the start of the epoch, +00:00:00 UTC January 1, 1970. +Note that negative values should be avoided +here, as they are handled inconsistently. 
+.It Va checksum +Header checksum, stored as an octal number in ASCII. +To compute the checksum, set the checksum field to all spaces, +then sum all bytes in the header using unsigned arithmetic. +This field should be stored as six octal digits followed by a null and a space +character. +Note that many early implementations of tar used signed arithmetic +for the checksum field, which can cause interoperability problems +when transferring archives between systems. +Modern robust readers compute the checksum both ways and accept the +header if either computation matches. +.It Va linkflag , Va linkname +In order to preserve hardlinks and conserve tape, a file +with multiple links is only written to the archive the first +time it is encountered. +The next time it is encountered, the +.Va linkflag +is set to an ASCII +.Sq 1 +and the +.Va linkname +field holds the first name under which this file appears. +(Note that regular files have a null value in the +.Va linkflag +field.) +.El +.Pp +Early tar implementations varied in how they terminated these fields. +The tar command in +.At v7 +used the following conventions (this is also documented in early BSD manpages): +the pathname must be null-terminated; +the mode, uid, and gid fields must end in a space and a null byte; +the size and mtime fields must end in a space; +the checksum is terminated by a null and a space. +Early implementations filled the numeric fields with leading spaces. +This seems to have been common practice until the +.St -p1003.1-88 +standard was released. +For best portability, modern implementations should fill the numeric +fields with leading zeros. +.Ss Pre-POSIX Archives +An early draft of +.St -p1003.1-88 +served as the basis for John Gilmore's +.Nm pdtar +program and many system implementations from the late 1980s +and early 1990s. 
+These archives generally follow the POSIX ustar +format described below with the following variations: +.Bl -bullet -compact -width indent +.It +The magic value is +.Dq ustar\ \& +(note the following space). +The version field contains a space character followed by a null. +.It +The numeric fields are generally filled with leading spaces +(not leading zeros as recommended in the final standard). +.It +The prefix field is often not used, limiting pathnames to +the 100 characters of old-style archives. +.El +.Ss POSIX ustar Archives +.St -p1003.1-88 +defined a standard tar file format to be read and written +by compliant implementations of +.Xr tar 1 . +This format is often called the +.Dq ustar +format, after the magic value used +in the header. +(The name is an acronym for +.Dq Unix Standard TAR . ) +It extends the historic format with new fields: +.Bd -literal -offset indent +struct header_posix_ustar { + char name[100]; + char mode[8]; + char uid[8]; + char gid[8]; + char size[12]; + char mtime[12]; + char checksum[8]; + char typeflag[1]; + char linkname[100]; + char magic[6]; + char version[2]; + char uname[32]; + char gname[32]; + char devmajor[8]; + char devminor[8]; + char prefix[155]; + char pad[12]; +}; +.Ed +.Bl -tag -width indent +.It Va typeflag +Type of entry. +POSIX extended the earlier +.Va linkflag +field with several new type values: +.Bl -tag -width indent -compact +.It Dq 0 +Regular file. +NUL should be treated as a synonym, for compatibility purposes. +.It Dq 1 +Hard link. +.It Dq 2 +Symbolic link. +.It Dq 3 +Character device node. +.It Dq 4 +Block device node. +.It Dq 5 +Directory. +.It Dq 6 +FIFO node. +.It Dq 7 +Reserved. +.It Other +A POSIX-compliant implementation must treat any unrecognized typeflag value +as a regular file. +In particular, writers should ensure that all entries +have a valid filename so that they can be restored by readers that do not +support the corresponding extension. 
+Uppercase letters "A" through "Z" are reserved for custom extensions. +Note that sockets and whiteout entries are not archivable. +.El +It is worth noting that the +.Va size +field, in particular, has different meanings depending on the type. +For regular files, of course, it indicates the amount of data +following the header. +For directories, it may be used to indicate the total size of all +files in the directory, for use by operating systems that pre-allocate +directory space. +For all other types, it should be set to zero by writers and ignored +by readers. +.It Va magic +Contains the magic value +.Dq ustar +followed by a NUL byte to indicate that this is a POSIX standard archive. +Full compliance requires the uname and gname fields be properly set. +.It Va version +Version. +This should be +.Dq 00 +(two copies of the ASCII digit zero) for POSIX standard archives. +.It Va uname , Va gname +User and group names, as null-terminated ASCII strings. +These should be used in preference to the uid/gid values +when they are set and the corresponding names exist on +the system. +.It Va devmajor , Va devminor +Major and minor numbers for character device or block device entry. +.It Va name , Va prefix +If the pathname is too long to fit in the 100 bytes provided by the standard +format, it can be split at any +.Pa / +character with the first portion going into the prefix field. +If the prefix field is not empty, the reader will prepend +the prefix value and a +.Pa / +character to the regular name field to obtain the full pathname. +The standard does not require a trailing +.Pa / +character on directory names, though most implementations still +include this for compatibility reasons. +.El +.Pp +Note that all unused bytes must be set to +.Dv NUL . +.Pp +Field termination is specified slightly differently by POSIX +than by previous implementations. +The +.Va magic , +.Va uname , +and +.Va gname +fields must have a trailing +.Dv NUL . 
+The +.Va pathname , +.Va linkname , +and +.Va prefix +fields must have a trailing +.Dv NUL +unless they fill the entire field. +(In particular, it is possible to store a 256-character pathname if it +happens to have a +.Pa / +as the 156th character.) +POSIX requires numeric fields to be zero-padded in the front, and requires +them to be terminated with either space or +.Dv NUL +characters. +.Pp +Currently, most tar implementations comply with the ustar +format, occasionally extending it by adding new fields to the +blank area at the end of the header record. +.Ss Pax Interchange Format +There are many attributes that cannot be portably stored in a +POSIX ustar archive. +.St -p1003.1-2001 +defined a +.Dq pax interchange format +that uses two new types of entries to hold text-formatted +metadata that applies to following entries. +Note that a pax interchange format archive is a ustar archive in every +respect. +The new data is stored in ustar-compatible archive entries that use the +.Dq x +or +.Dq g +typeflag. +In particular, older implementations that do not fully support these +extensions will extract the metadata into regular files, where the +metadata can be examined as necessary. +.Pp +An entry in a pax interchange format archive consists of one or +two standard ustar entries, each with its own header and data. +The first optional entry stores the extended attributes +for the following entry. +This optional first entry has an "x" typeflag and a size field that +indicates the total size of the extended attributes. +The extended attributes themselves are stored as a series of text-format +lines encoded in the portable UTF-8 encoding. +Each line consists of a decimal number, a space, a key string, an equals +sign, a value string, and a new line. +The decimal number indicates the length of the entire line, including the +initial length field and the trailing newline. 
+An example of such a field is: +.Dl 25 ctime=1084839148.1212\en +Keys in all lowercase are standard keys. +Vendors can add their own keys by prefixing them with an all uppercase +vendor name and a period. +Note that, unlike the historic header, numeric values are stored using +decimal, not octal. +A description of some common keys follows: +.Bl -tag -width indent +.It Cm atime , Cm ctime , Cm mtime +File access, inode change, and modification times. +These fields can be negative or include a decimal point and a fractional value. +.It Cm uname , Cm uid , Cm gname , Cm gid +User name, group name, and numeric UID and GID values. +The user name and group name stored here are encoded in UTF8 +and can thus include non-ASCII characters. +The UID and GID fields can be of arbitrary length. +.It Cm linkpath +The full path of the linked-to file. +Note that this is encoded in UTF8 and can thus include non-ASCII characters. +.It Cm path +The full pathname of the entry. +Note that this is encoded in UTF8 and can thus include non-ASCII characters. +.It Cm realtime.* , Cm security.* +These keys are reserved and may be used for future standardization. +.It Cm size +The size of the file. +Note that there is no length limit on this field, allowing conforming +archives to store files much larger than the historic 8GB limit. +.It Cm SCHILY.* +Vendor-specific attributes used by Joerg Schilling's +.Nm star +implementation. +.It Cm SCHILY.acl.access , Cm SCHILY.acl.default +Stores the access and default ACLs as textual strings in a format +that is an extension of the format specified by POSIX.1e draft 17. +In particular, each user or group access specification can include a fourth +colon-separated field with the numeric UID or GID. +This allows ACLs to be restored on systems that may not have complete +user or group information available (such as when NIS/YP or LDAP services +are temporarily unavailable). 
+.It Cm SCHILY.devminor , Cm SCHILY.devmajor +The full minor and major numbers for device nodes. +.It Cm SCHILY.fflags +The file flags. +.It Cm SCHILY.realsize +The full size of the file on disk. +XXX explain? XXX +.It Cm SCHILY.dev , Cm SCHILY.ino , Cm SCHILY.nlinks +The device number, inode number, and link count for the entry. +In particular, note that a pax interchange format archive using Joerg +Schilling's +.Cm SCHILY.* +extensions can store all of the data from +.Va struct stat . +.It Cm LIBARCHIVE.xattr. Ns Ar namespace Ns . Ns Ar key +Libarchive stores POSIX.1e-style extended attributes using +keys of this form. +The +.Ar key +value is URL-encoded: +All non-ASCII characters and the two special characters +.Dq = +and +.Dq % +are encoded as +.Dq % +followed by two uppercase hexadecimal digits. +The value of this key is the extended attribute value +encoded in base 64. +XXX Detail the base-64 format here XXX +.It Cm VENDOR.* +XXX document other vendor-specific extensions XXX +.El +.Pp +Any values stored in an extended attribute override the corresponding +values in the regular tar header. +Note that compliant readers should ignore the regular fields when they +are overridden. +This is important, as existing archivers are known to store non-compliant +values in the standard header fields in this situation. +There are no limits on length for any of these fields. +In particular, numeric fields can be arbitrarily large. +All text fields are encoded in UTF8. +Compliant writers should store only portable 7-bit ASCII characters in +the standard ustar header and use extended +attributes whenever a text value contains non-ASCII characters. +.Pp +In addition to the +.Cm x +entry described above, the pax interchange format +also supports a +.Cm g +entry. +The +.Cm g +entry is identical in format, but specifies attributes that serve as +defaults for all subsequent archive entries. +The +.Cm g +entry is not widely used.
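The extended attribute records used by both the x and g entries have one subtle point: the leading decimal length counts its own digits. A minimal sketch of building one record follows, assuming the key and value are NUL-terminated text. `pax_format_record` and `decimal_digits` are hypothetical helpers written for this illustration, not libarchive functions.

```c
#include <stdio.h>
#include <string.h>

/* Count the decimal digits needed to print n. */
static size_t decimal_digits(size_t n)
{
    size_t d = 1;

    while (n >= 10) {
        n /= 10;
        d++;
    }
    return d;
}

/*
 * Format one pax extended-header record, "LEN key=value\n", into buf.
 * Returns the record length, or 0 if buf is too small.
 * Hypothetical helper for illustration only.
 */
static size_t pax_format_record(char *buf, size_t bufsize,
    const char *key, const char *value)
{
    /* Everything except the length digits: ' ', key, '=', value, '\n'. */
    size_t base = 1 + strlen(key) + 1 + strlen(value) + 1;
    size_t digits = 1;
    size_t total;

    /* The length field includes itself, so grow it until it is stable. */
    while (decimal_digits(base + digits) > digits)
        digits++;
    total = base + digits;

    if (total + 1 > bufsize)    /* +1 for the terminating NUL */
        return 0;
    snprintf(buf, bufsize, "%lu %s=%s\n", (unsigned long)total, key, value);
    return total;
}
```

Calling `pax_format_record(buf, sizeof buf, "ctime", "1084839148.1212")` reproduces the 25-byte example record given in the text: 22 bytes for the space, key, equals sign, value, and newline, plus 3 bytes for the length and its trailing space.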
+.Pp +Besides the new +.Cm x +and +.Cm g +entries, the pax interchange format has a few other minor variations +from the earlier ustar format. +The most troubling one is that hardlinks are permitted to have +data following them. +This allows readers to restore any hardlink to a file without +having to rewind the archive to find an earlier entry. +However, it creates complications for robust readers, as it is no longer +clear whether or not they should ignore the size field for hardlink entries. +.Ss GNU Tar Archives +The GNU tar program started with a pre-POSIX format similar to that +described earlier and has extended it using several different mechanisms: +It added new fields to the empty space in the header (some of which was later +used by POSIX for conflicting purposes); +it allowed the header to be continued over multiple records; +and it defined new entries that modify following entries +(similar in principle to the +.Cm x +entry described above, but each GNU special entry is single-purpose, +unlike the general-purpose +.Cm x +entry). +As a result, GNU tar archives are not POSIX compatible, although +more lenient POSIX-compliant readers can successfully extract most +GNU tar archives. 
+.Bd -literal -offset indent +struct header_gnu_tar { + char name[100]; + char mode[8]; + char uid[8]; + char gid[8]; + char size[12]; + char mtime[12]; + char checksum[8]; + char typeflag[1]; + char linkname[100]; + char magic[6]; + char version[2]; + char uname[32]; + char gname[32]; + char devmajor[8]; + char devminor[8]; + char atime[12]; + char ctime[12]; + char offset[12]; + char longnames[4]; + char unused[1]; + struct { + char offset[12]; + char numbytes[12]; + } sparse[4]; + char isextended[1]; + char realsize[12]; + char pad[17]; +}; +.Ed +.Bl -tag -width indent +.It Va typeflag +GNU tar uses the following special entry types, in addition to +those defined by POSIX: +.Bl -tag -width indent +.It "7" +GNU tar treats type "7" records identically to type "0" records, +except on one obscure RTOS where they are used to indicate the +pre-allocation of a contiguous file on disk. +.It "D" +This indicates a directory entry. +Unlike the POSIX-standard "5" +typeflag, the header is followed by data records listing the names +of files in this directory. +Each name is preceded by an ASCII "Y" +if the file is stored in this archive or "N" if the file is not +stored in this archive. +Each name is terminated with a null, and +an extra null marks the end of the name list. +The purpose of this +entry is to support incremental backups; a program restoring from +such an archive may wish to delete files on disk that did not exist +in the directory when the archive was made. +.Pp +Note that the "D" typeflag specifically violates POSIX, which requires +that unrecognized typeflags be restored as normal files. +In this case, restoring the "D" entry as a file could interfere +with subsequent creation of the like-named directory. +.It "K" +The data for this entry is a long linkname for the following regular entry. +.It "L" +The data for this entry is a long pathname for the following regular entry. +.It "M" +This is a continuation of the last file on the previous volume. 
+GNU multi-volume archives guarantee that each volume begins with a valid +entry header. +To ensure this, a file may be split, with part stored at the end of one volume, +and part stored at the beginning of the next volume. +The "M" typeflag indicates that this entry continues an existing file. +Such entries can only occur as the first or second entry +in an archive (the latter only if the first entry is a volume label). +The +.Va size +field specifies the size of this entry. +The +.Va offset +field at bytes 369-380 specifies the offset where this file fragment +begins. +The +.Va realsize +field specifies the total size of the file (which must equal +.Va size +plus +.Va offset ) . +When extracting, GNU tar checks that the header file name is the one it is +expecting, that the header offset is in the correct sequence, and that +the sum of offset and size is equal to realsize. +.It "N" +Type "N" records are no longer generated by GNU tar. +They contained a +list of files to be renamed or symlinked after extraction; this was +originally used to support long names. +The contents of this record +are a text description of the operations to be done, in the form +.Dq Rename %s to %s\en +or +.Dq Symlink %s to %s\en ; +in either case, both +filenames are escaped using K&R C syntax. +Due to security concerns, "N" records are now generally ignored +when reading archives. +.It "S" +This is a +.Dq sparse +regular file. +Sparse files are stored as a series of fragments. +The header contains a list of fragment offset/length pairs. +If more than four such entries are required, the header is +extended as necessary with +.Dq extra +header extensions (an older format that is no longer used), or +.Dq sparse +extensions. +.It "V" +The +.Va name +field should be interpreted as a tape/volume header name. +This entry should generally be ignored on extraction. +.El +.It Va magic +The magic field holds the five characters +.Dq ustar +followed by a space. 
+Note that POSIX ustar archives have a trailing null. +.It Va version +The version field holds a space character followed by a null. +Note that POSIX ustar archives use two copies of the ASCII digit +.Dq 0 . +.It Va atime , Va ctime +The time the file was last accessed and the time of +last change of file information, stored in octal as with +.Va mtime . +.It Va longnames +This field is apparently no longer used. +.It Sparse Va offset / Va numbytes +Each such structure specifies a single fragment of a sparse +file. +The two fields store values as octal numbers. +The fragments are each padded to a multiple of 512 bytes +in the archive. +On extraction, the list of fragments is collected from the +header (including any extension headers), and the data +is then read and written to the file at appropriate offsets. +.It Va isextended +If this is set to non-zero, the header will be followed by additional +.Dq sparse header +records. +Each such record contains information about as many as 21 additional +sparse blocks as shown here: +.Bd -literal -offset indent +struct gnu_sparse_header { + struct { + char offset[12]; + char numbytes[12]; + } sparse[21]; + char isextended[1]; + char padding[7]; +}; +.Ed +.It Va realsize +A binary representation of the file's complete size, with a much larger range +than the POSIX file size. +In particular, with +.Cm M +type files, the current entry is only a portion of the file. +In that case, the POSIX size field will indicate the size of this +entry; the +.Va realsize +field will indicate the total size of the file. +.El +.Ss GNU tar pax archives +GNU tar 1.14 (XXX check this XXX) and later will write +pax interchange format archives when you specify the +.Fl -posix +flag. +This format uses custom keywords to store sparse file information. +There have been three iterations of this support, referred to +as +.Dq 0.0 , +.Dq 0.1 , +and +.Dq 1.0 . 
+.Bl -tag -width indent +.It Cm GNU.sparse.numblocks , Cm GNU.sparse.offset , Cm GNU.sparse.numbytes , Cm GNU.sparse.size +The +.Dq 0.0 +format used an initial +.Cm GNU.sparse.numblocks +attribute to indicate the number of blocks in the file, a pair of +.Cm GNU.sparse.offset +and +.Cm GNU.sparse.numbytes +to indicate the offset and size of each block, +and a single +.Cm GNU.sparse.size +to indicate the full size of the file. +This is not the same as the size in the tar header because the +latter value does not include the size of any holes. +This format required that the order of attributes be preserved and +relied on readers accepting multiple appearances of the same attribute +names, which is not officially permitted by the standards. +.It Cm GNU.sparse.map +The +.Dq 0.1 +format used a single attribute that stored a comma-separated +list of decimal numbers. +Each pair of numbers indicated the offset and size, respectively, +of a block of data. +This does not work well if the archive is extracted by an archiver +that does not recognize this extension, since many pax implementations +simply discard unrecognized attributes. +.It Cm GNU.sparse.major , Cm GNU.sparse.minor , Cm GNU.sparse.name , Cm GNU.sparse.realsize +The +.Dq 1.0 +format stores the sparse block map in one or more 512-byte blocks +prepended to the file data in the entry body. +The pax attributes indicate the existence of this map +(via the +.Cm GNU.sparse.major +and +.Cm GNU.sparse.minor +fields) +and the full size of the file. +The +.Cm GNU.sparse.name +holds the true name of the file. +To avoid confusion, the name stored in the regular tar header +is a modified name so that extraction errors will be apparent +to users. +.El +.Ss Solaris Tar +XXX More Details Needed XXX +.Pp +Solaris tar (beginning with SunOS XXX 5.7 ?? 
XXX) supports an +.Dq extended +format that is fundamentally similar to pax interchange format, +with the following differences: +.Bl -bullet -compact -width indent +.It +Extended attributes are stored in an entry whose type is +.Cm X , +not +.Cm x , +as used by pax interchange format. +The detailed format of this entry appears to be the same +as detailed above for the +.Cm x +entry. +.It +An additional +.Cm A +entry is used to store an ACL for the following regular entry. +The body of this entry contains a seven-digit octal number +followed by a zero byte, followed by the +textual ACL description. +The octal value is the number of ACL entries +plus a constant that indicates the ACL type: 01000000 +for POSIX.1e ACLs and 03000000 for NFSv4 ACLs. +.El +.Ss AIX Tar +XXX More details needed XXX +.Ss Mac OS X Tar +The tar distributed with Apple's Mac OS X stores most regular files +as two separate entries in the tar archive. +The two entries have the same name except that the first +one has +.Dq ._ +added to the beginning of the name. +This first entry stores the +.Dq resource fork +with additional attributes for the file. +The Mac OS X +.Fn CopyFile +API is used to separate a file on disk into separate +resource and data streams and to reassemble those separate +streams when the file is restored to disk. +.Ss Other Extensions +One obvious extension to increase the size of files is to +eliminate the terminating characters from the various +numeric fields. +For example, the standard only allows the size field to contain +11 octal digits, reserving the twelfth byte for a trailing +NUL character. +Allowing 12 octal digits allows file sizes up to 64 GB. +.Pp +Another extension, utilized by GNU tar, star, and other newer +.Nm +implementations, permits binary numbers in the standard numeric fields. +This is flagged by setting the high bit of the first byte. +This permits 95-bit values for the length and time fields +and 63-bit values for the uid, gid, and device numbers. 
+GNU tar supports this extension for the +length, mtime, ctime, and atime fields. +Joerg Schilling's star program supports this extension for +all numeric fields. +Note that this extension is largely obsoleted by the extended attribute +record provided by the pax interchange format. +.Pp +Another early GNU extension allowed base-64 values rather than octal. +This extension was short-lived and is no longer supported by any +implementation. +.Sh SEE ALSO +.Xr ar 1 , +.Xr pax 1 , +.Xr tar 1 +.Sh STANDARDS +The +.Nm tar +utility is no longer a part of POSIX or the Single UNIX Specification. +It last appeared in +.St -susv2 . +It has been supplanted in subsequent standards by +.Xr pax 1 . +The ustar format is currently part of the specification for the +.Xr pax 1 +utility. +The pax interchange file format is new with +.St -p1003.1-2001 . +.Sh HISTORY +A +.Nm tar +command appeared in Seventh Edition Unix, which was released in January, 1979. +It replaced the +.Nm tp +program from Fourth Edition Unix which in turn replaced the +.Nm tap +program from First Edition Unix. +John Gilmore's +.Nm pdtar +public-domain implementation (circa 1987) was highly influential +and formed the basis of +.Nm GNU tar +(circa 1988). +Joerg Schilling's +.Nm star +archiver is another open-source (GPL) archiver (originally developed +circa 1985) which features complete support for pax interchange +format. +.Pp +This documentation was written as part of the +.Nm libarchive +and +.Nm bsdtar +project by +.An Tim Kientzle Aq kientzle@FreeBSD.org .