Package | Description |
---|---|
org.apache.hadoop.conf |
Configuration of system parameters.
|
org.apache.hadoop.examples |
Hadoop example code.
|
org.apache.hadoop.examples.dancing |
This package is a distributed implementation of Knuth's dancing links
algorithm that can run under Hadoop.
|
org.apache.hadoop.examples.terasort |
This package consists of 3 map/reduce applications for Hadoop to
compete in the annual terabyte sort
competition.
|
org.apache.hadoop.fs |
An abstract file system API.
|
org.apache.hadoop.fs.ftp | |
org.apache.hadoop.fs.kfs |
A client for the Kosmos filesystem (KFS).
This page describes how to use the Kosmos Filesystem (KFS) as a backing
store with Hadoop.
|
org.apache.hadoop.fs.s3 |
A distributed, block-based implementation of
FileSystem that uses Amazon S3
as a backing store. |
org.apache.hadoop.fs.s3native |
A distributed implementation of
FileSystem for reading and writing files on
Amazon S3. |
org.apache.hadoop.fs.shell | |
org.apache.hadoop.hdfs |
A distributed implementation of
FileSystem. |
org.apache.hadoop.hdfs.server.balancer | |
org.apache.hadoop.hdfs.server.datanode | |
org.apache.hadoop.hdfs.tools | |
org.apache.hadoop.hdfs.web | |
org.apache.hadoop.io |
Generic i/o code for use when reading and writing data to the network,
to databases, and to files.
|
org.apache.hadoop.io.compress | |
org.apache.hadoop.io.serializer |
This package provides a mechanism for using different serialization frameworks
in Hadoop.
|
org.apache.hadoop.mapred |
A software framework for easily writing applications which process vast
amounts of data (multi-terabyte data-sets) in parallel on large clusters
(thousands of nodes) built of commodity hardware in a reliable, fault-tolerant
manner.
|
org.apache.hadoop.mapred.join |
Given a set of sorted datasets keyed with the same class and yielding equal
partitions, it is possible to effect a join of those datasets prior to the map.
|
org.apache.hadoop.mapred.lib |
Library of generally useful mappers, reducers, and partitioners.
|
org.apache.hadoop.mapred.pipes |
Hadoop Pipes allows C++ code to use Hadoop DFS and map/reduce.
|
org.apache.hadoop.mapred.tools | |
org.apache.hadoop.mapreduce.lib.db | |
org.apache.hadoop.mapreduce.lib.partition | |
org.apache.hadoop.net |
Network-related classes.
|
org.apache.hadoop.streaming |
Hadoop Streaming is a utility which allows users to create and run
Map-Reduce jobs with any executables (e.g.
|
org.apache.hadoop.tools | |
org.apache.hadoop.tools.distcp2 | |
org.apache.hadoop.tools.rumen |
Rumen is a data extraction and analysis tool built for
Apache Hadoop.
|
org.apache.hadoop.typedbytes |
Typed bytes are sequences of bytes in which the first byte is a type code.
|
org.apache.hadoop.util |
Common utilities.
|
Modifier and Type | Class and Description |
---|---|
class |
Configured
Base class for things that may be configured with a
Configuration. |
Modifier and Type | Class and Description |
---|---|
class |
DBCountPageView
This is a demonstrative program, which uses DBInputFormat for reading
the input data from a database, and DBOutputFormat for writing the data
to the database.
|
class |
Grep |
class |
Join
This is the trivial map/reduce program that does absolutely nothing
other than use the framework to fragment and sort the input values.
|
class |
MultiFileWordCount
MultiFileWordCount is an example to demonstrate the usage of
MultiFileInputFormat.
|
class |
PiEstimator
A Map-reduce program to estimate the value of Pi
using a quasi-Monte Carlo method.
|
class |
RandomTextWriter
This program uses map/reduce to just run a distributed job where there is
no interaction between the tasks and each task writes a large unsorted
random sequence of words.
|
class |
RandomWriter
This program uses map/reduce to just run a distributed job where there is
no interaction between the tasks and each task writes a large unsorted
random binary sequence file of BytesWritable.
|
class |
SleepJob
Dummy class for testing the MR framework.
|
static class |
SleepJob.SleepInputFormat |
class |
Sort<K,V>
This is the trivial map/reduce program that does absolutely nothing
other than use the framework to fragment and sort the input values.
|
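
The quasi-Monte Carlo idea named in the PiEstimator entry can be sketched in plain Java without Hadoop. This is a single-process illustration, not the example's actual code: the class name `QuasiMonteCarloPiSketch`, the choice of a Halton sequence with bases 2 and 3, and the point count are all assumptions made for the sketch.

```java
// Hypothetical sketch: estimate Pi from a low-discrepancy (Halton) point set.
class QuasiMonteCarloPiSketch {

    // Radical inverse of index in the given base: mirrors the digits of
    // index around the radix point, e.g. 3 = 11 (base 2) -> 0.11 (base 2) = 0.75.
    static double radicalInverse(long index, int base) {
        double result = 0.0;
        double fraction = 1.0 / base;
        while (index > 0) {
            result += (index % base) * fraction;
            index /= base;
            fraction /= base;
        }
        return result;
    }

    // Counts Halton points that fall inside the circle of radius 0.5
    // inscribed in the unit square; the hit ratio approximates Pi / 4.
    static double estimate(int points) {
        long inside = 0;
        for (int i = 1; i <= points; i++) {
            double x = radicalInverse(i, 2) - 0.5;
            double y = radicalInverse(i, 3) - 0.5;
            if (x * x + y * y <= 0.25) {
                inside++;
            }
        }
        return 4.0 * inside / points;
    }

    public static void main(String[] args) {
        System.out.println(estimate(100_000));
    }
}
```

In a map/reduce setting each task would presumably evaluate a disjoint range of point indices and the reducer would sum the inside and outside counts; that split is not shown here.
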
Modifier and Type | Class and Description |
---|---|
class |
DistributedPentomino
Launch a distributed pentomino solver.
|
Modifier and Type | Class and Description |
---|---|
class |
TeraGen
Generate the official terasort input data set.
|
class |
TeraSort
Generates the sampled split points, launches the job, and waits for it to
finish.
|
class |
TeraValidate
Generates one mapper per file that checks to make sure the keys
are sorted within each file.
|
Modifier and Type | Class and Description |
---|---|
class |
ChecksumFileSystem
Abstract checksummed FileSystem.
|
class |
FileSystem
An abstract base class for a fairly generic filesystem.
|
class |
FilterFileSystem
A
FilterFileSystem contains
some other file system, which it uses as
its basic file system, possibly transforming
the data along the way or providing additional
functionality. |
class |
FsShell
Provide command line access to a FileSystem.
|
class |
HarFileSystem
This is an implementation of the Hadoop Archive
Filesystem.
|
class |
InMemoryFileSystem
Deprecated.
|
class |
LocalFileSystem
Implement the FileSystem API for the checksummed local filesystem.
|
class |
RawLocalFileSystem
Implement the FileSystem API for the raw local filesystem.
|
class |
Trash
Provides a trash feature.
|
Modifier and Type | Class and Description |
---|---|
class |
FTPFileSystem
A
FileSystem backed by an FTP client provided by Apache Commons Net. |
Modifier and Type | Class and Description |
---|---|
class |
KosmosFileSystem
A FileSystem backed by KFS.
|
Modifier and Type | Class and Description |
---|---|
class |
MigrationTool
This class is a tool for migrating data from an older to a newer version
of an S3 filesystem.
|
class |
S3FileSystem
A block-based
FileSystem backed by
Amazon S3. |
Modifier and Type | Class and Description |
---|---|
class |
NativeS3FileSystem
A
FileSystem for reading and writing files stored on
Amazon S3. |
Modifier and Type | Class and Description |
---|---|
class |
Command
An abstract class for the execution of a file system command.
|
class |
Count
Count the number of directories, files, bytes, quota, and remaining quota.
|
Modifier and Type | Class and Description |
---|---|
class |
ChecksumDistributedFileSystem
An implementation of ChecksumFileSystem over DistributedFileSystem.
|
class |
DistributedFileSystem
Implementation of the abstract FileSystem for the DFS system.
|
class |
HftpFileSystem
An implementation of a protocol for accessing filesystems over HTTP.
|
class |
HsftpFileSystem
An implementation of a protocol for accessing filesystems over HTTPS.
|
Modifier and Type | Class and Description |
---|---|
class |
Balancer
The balancer is a tool that balances disk space usage on an HDFS cluster
when some datanodes become full or when new empty nodes join the cluster.
|
Modifier and Type | Class and Description |
---|---|
class |
DataNode
DataNode is a class (and program) that stores a set of
blocks for a DFS deployment.
|
Modifier and Type | Class and Description |
---|---|
class |
DFSAdmin
This class provides some DFS administrative access.
|
class |
DFSck
This class provides rudimentary checking of DFS volumes for errors and
sub-optimal conditions.
|
Modifier and Type | Class and Description |
---|---|
class |
WebHdfsFileSystem
A FileSystem for HDFS over the web.
|
Modifier and Type | Class and Description |
---|---|
class |
AbstractMapWritable
Abstract base class for MapWritable and SortedMapWritable.
Unlike org.apache.nutch.crawl.MapWritable, this class allows creation of
MapWritable<Writable, MapWritable> so the CLASS_TO_ID and ID_TO_CLASS
maps travel with the class instead of being static.
|
class |
GenericWritable
A wrapper for Writable instances.
|
class |
MapWritable
A Writable Map.
|
class |
ObjectWritable
A polymorphic Writable that writes an instance with its class name.
|
class |
SortedMapWritable
A Writable SortedMap.
|
Modifier and Type | Class and Description |
---|---|
class |
DefaultCodec |
class |
GzipCodec
This class creates gzip compressors/decompressors.
|
class |
SnappyCodec
This class creates snappy compressors/decompressors.
|
Modifier and Type | Class and Description |
---|---|
class |
SerializationFactory
A factory for
Serializations. |
class |
WritableSerialization
A
Serialization for Writables that delegates to
Writable.write(java.io.DataOutput) and
Writable.readFields(java.io.DataInput). |
Modifier and Type | Interface and Description |
---|---|
static interface |
SequenceFileInputFilter.Filter
Filter interface.
|
Modifier and Type | Class and Description |
---|---|
class |
DefaultTaskController
The default implementation for controlling tasks.
|
class |
JobClient
JobClient is the primary interface for the user-job to interact
with the JobTracker. |
static class |
SequenceFileInputFilter.FilterBase
Base class for Filters.
|
static class |
SequenceFileInputFilter.MD5Filter
This class returns a set of records by examining the MD5 digest of its
key against a filtering frequency f.
|
static class |
SequenceFileInputFilter.PercentFilter
This class returns a percentage of records.
The percentage is determined by a filtering frequency f using
the criteria record# % f == 0.
|
static class |
SequenceFileInputFilter.RegexFilter
Filters records by matching the key against a regex.
|
class |
Task
Base class for tasks.
|
class |
TaskController
Controls initialization, finalization and clean up of tasks, and
also the launching and killing of task JVMs.
|
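
The two frequency-based filters described above can be illustrated with a small standalone sketch. It is not the Hadoop implementation: the class `RecordFilterSketch` and both method names are hypothetical, record numbers are assumed to start at 0, and reducing the MD5 digest to a number by reading its last 8 bytes as a `long` is an assumption made only for this example.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical sketch of frequency-based record sampling.
class RecordFilterSketch {

    // PercentFilter-style rule from the table: keep a record when
    // record# % f == 0, i.e. records 0, f, 2f, ...
    static boolean acceptByPosition(long recordNumber, int frequency) {
        return recordNumber % frequency == 0;
    }

    // MD5Filter-style rule: hash the key, then keep it when the hash is a
    // multiple of f. Unlike the positional rule, the decision depends only
    // on the key, so the same key is always kept or always dropped.
    static boolean acceptByDigest(String key, int frequency) {
        try {
            byte[] digest = MessageDigest.getInstance("MD5")
                    .digest(key.getBytes(StandardCharsets.UTF_8));
            // Assumption: interpret the last 8 of the 16 digest bytes as a long.
            long hash = ByteBuffer.wrap(digest, 8, 8).getLong();
            return Math.floorMod(hash, (long) frequency) == 0;
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 digest is unavailable", e);
        }
    }
}
```

With a frequency of 1 both rules keep every record; larger values of f keep roughly one record in f.
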
Modifier and Type | Class and Description |
---|---|
class |
CompositeRecordReader<K extends WritableComparable,V extends Writable,X extends Writable>
A RecordReader that can effect joins of RecordReaders sharing a common key
type and partitioning.
|
class |
InnerJoinRecordReader<K extends WritableComparable>
Full inner join.
|
class |
JoinRecordReader<K extends WritableComparable>
Base class for Composite joins returning Tuples of arbitrary Writables.
|
class |
MultiFilterRecordReader<K extends WritableComparable,V extends Writable>
Base class for Composite join returning values derived from multiple
sources, but generally not tuples.
|
class |
OuterJoinRecordReader<K extends WritableComparable>
Full outer join.
|
class |
OverrideRecordReader<K extends WritableComparable,V extends Writable>
Prefer the "rightmost" data source for this key.
|
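
The "rightmost source wins" rule listed for OverrideRecordReader can be shown with a plain-Java sketch. Here each data source is modelled as an in-memory `Map`, which is a simplification: the real readers stream sorted, identically partitioned datasets. The class `OverrideSketch` and the method `rightmost` are hypothetical names introduced for illustration.

```java
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Hypothetical sketch: resolve a key against several sources, preferring
// the value from the source that appears last (rightmost) in the list.
class OverrideSketch {

    static <K, V> Optional<V> rightmost(K key, List<Map<K, V>> sources) {
        // Walk the sources from right to left and stop at the first hit.
        for (int i = sources.size() - 1; i >= 0; i--) {
            Map<K, V> source = sources.get(i);
            if (source.containsKey(key)) {
                return Optional.ofNullable(source.get(key));
            }
        }
        // No source holds the key.
        return Optional.empty();
    }
}
```

An inner join would instead emit a key only when every source contains it, and an outer join would emit it when any source does.
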
Modifier and Type | Class and Description |
---|---|
class |
InputSampler<K,V>
Utility for collecting samples and writing a partition file for
TotalOrderPartitioner. |
Modifier and Type | Class and Description |
---|---|
class |
Submitter
The main entry point and job submitter.
|
Modifier and Type | Class and Description |
---|---|
class |
MRAdmin
Administrative access to Hadoop Map-Reduce.
|
Modifier and Type | Class and Description |
---|---|
class |
DataDrivenDBInputFormat<T extends DBWritable>
An InputFormat that reads input data from an SQL table.
|
class |
DBInputFormat<T extends DBWritable>
An InputFormat that reads input data from an SQL table.
|
class |
OracleDataDrivenDBInputFormat<T extends DBWritable>
An InputFormat that reads input data from an SQL table in an Oracle db.
|
Modifier and Type | Class and Description |
---|---|
class |
BinaryPartitioner<V>
Partition
BinaryComparable keys using a configurable part of
the bytes array returned by BinaryComparable.getBytes(). |
class |
KeyFieldBasedComparator<K,V>
This comparator implementation provides a subset of the features provided
by the Unix/GNU Sort.
|
class |
KeyFieldBasedPartitioner<K2,V2>
Defines a way to partition keys based on certain key fields (also see
KeyFieldBasedComparator). |
class |
TotalOrderPartitioner<K extends WritableComparable<?>,V>
Partitioner effecting a total order by reading split points from
an externally generated source.
|
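
How split points produce a total order can be sketched with a binary search over a sorted array. This is an illustration under assumptions, not the library's code: keys are plain `String`s, the split points are passed in directly rather than read from a partition file, and `TotalOrderSketch` and `partitionFor` are hypothetical names.

```java
import java.util.Arrays;

// Hypothetical sketch: map a key to a partition using sorted split points.
class TotalOrderSketch {

    // With n split points there are n + 1 partitions. Keys below the first
    // split point go to partition 0; a key equal to split point i goes to
    // partition i + 1, so every key in partition p sorts before every key
    // in partition p + 1.
    static int partitionFor(String key, String[] sortedSplitPoints) {
        // binarySearch returns the match index, or -(insertionPoint) - 1.
        int pos = Arrays.binarySearch(sortedSplitPoints, key) + 1;
        return pos < 0 ? -pos : pos;
    }

    public static void main(String[] args) {
        String[] splits = {"g", "p"};
        for (String key : new String[] {"a", "g", "k", "p", "z"}) {
            System.out.println(key + " -> " + partitionFor(key, splits));
        }
    }
}
```

Because partitions are ordered by key range, concatenating the sorted output of partition 0, 1, 2, ... yields a globally sorted result.
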
Modifier and Type | Class and Description |
---|---|
class |
ScriptBasedMapping
This class implements the
DNSToSwitchMapping interface using a
script configured via topology.script.file.name. |
class |
SocksSocketFactory
Specialized SocketFactory to create sockets with a SOCKS proxy.
|
Modifier and Type | Class and Description |
---|---|
class |
DumpTypedBytes
Utility program that fetches all files that match a given pattern and dumps
their content to stdout as typed bytes.
|
class |
LoadTypedBytes
Utility program that reads typed bytes from standard input and stores them in
a sequence file for which the path is given as an argument.
|
class |
StreamJob
All the client-side work happens here.
|
Modifier and Type | Class and Description |
---|---|
class |
DistCh
A Map-reduce program to recursively change file properties
such as owner, group and permission.
|
class |
DistCp
A Map-reduce program to recursively copy directories between
different file-systems.
|
class |
HadoopArchives
An archive creation utility.
|
static class |
Logalyzer.LogComparator
A WritableComparator optimized for UTF8 keys of the logs.
|
Modifier and Type | Class and Description |
---|---|
class |
CopyListing
The CopyListing abstraction is responsible for how the list of
sources and targets is constructed, for DistCp's copy function.
|
class |
FileBasedCopyListing
FileBasedCopyListing implements the CopyListing interface,
to create the copy-listing for DistCp,
by iterating over all source paths mentioned in a specified input-file.
|
class |
GlobbedCopyListing
GlobbedCopyListing implements the CopyListing interface, to create the copy
listing-file by "globbing" all specified source paths (wild-cards and all.)
|
class |
SimpleCopyListing
The SimpleCopyListing is responsible for making the exhaustive list of
all files/directories under its specified list of input-paths.
|
Modifier and Type | Class and Description |
---|---|
class |
HadoopLogsAnalyzer
Deprecated.
|
class |
TraceBuilder
The main driver of the Rumen Parser.
|
Modifier and Type | Class and Description |
---|---|
class |
TypedBytesWritableInput
Provides functionality for reading typed bytes as Writable objects.
|
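
The typed bytes layout (a type code in the first byte, followed by the payload) can be sketched as a tiny encoder and decoder. This is a simplified illustration: only two types are handled, the class `TypedBytesSketch` is hypothetical, and the numeric codes and the length-prefixed string layout used here are assumptions for the sketch; consult the typedbytes package documentation for the actual format.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch of a "first byte is a type code" encoding.
class TypedBytesSketch {
    static final int INT_CODE = 3;     // assumed code for a 32-bit integer
    static final int STRING_CODE = 7;  // assumed code for a UTF-8 string

    static byte[] encode(Object value) {
        if (value instanceof Integer) {
            // <code byte> <4-byte big-endian int>
            return ByteBuffer.allocate(1 + 4)
                    .put((byte) INT_CODE)
                    .putInt((Integer) value)
                    .array();
        }
        if (value instanceof String) {
            // <code byte> <4-byte length> <UTF-8 bytes>
            byte[] utf8 = ((String) value).getBytes(StandardCharsets.UTF_8);
            return ByteBuffer.allocate(1 + 4 + utf8.length)
                    .put((byte) STRING_CODE)
                    .putInt(utf8.length)
                    .put(utf8)
                    .array();
        }
        throw new IllegalArgumentException("unsupported type: " + value.getClass());
    }

    static Object decode(byte[] bytes) {
        ByteBuffer in = ByteBuffer.wrap(bytes);
        // The first byte selects how the rest of the sequence is read.
        int code = in.get() & 0xFF;
        switch (code) {
            case INT_CODE:
                return in.getInt();
            case STRING_CODE:
                byte[] utf8 = new byte[in.getInt()];
                in.get(utf8);
                return new String(utf8, StandardCharsets.UTF_8);
            default:
                throw new IllegalArgumentException("unknown type code: " + code);
        }
    }
}
```

Because the reader learns the type from the data itself, a stream can mix values of different types without an external schema.
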
Modifier and Type | Interface and Description |
---|---|
interface |
Tool
A tool interface that supports handling of generic command-line options.
|
Modifier and Type | Class and Description |
---|---|
class |
LinuxMemoryCalculatorPlugin
Deprecated.
Use
LinuxResourceCalculatorPlugin
instead. |
class |
LinuxResourceCalculatorPlugin
Plugin to calculate resource information on Linux systems.
|
class |
MemoryCalculatorPlugin
Deprecated.
Use
ResourceCalculatorPlugin
instead. |
class |
ResourceCalculatorPlugin
Plugin to calculate resource information on the system.
|
Copyright © 2009 The Apache Software Foundation