Type: Package
Title: Interface to 'TensorFlow IO'
Version: 0.4.1
Description: Interface to 'TensorFlow IO', Datasets and filesystem extensions maintained by 'TensorFlow SIG-IO' <https://github.com/tensorflow/community/blob/master/sigs/io/CHARTER.md>.
License: Apache License 2.0
URL: https://github.com/tensorflow/io
BugReports: https://github.com/tensorflow/io/issues
SystemRequirements: TensorFlow >= 1.13.0 (https://www.tensorflow.org/) and TensorFlow IO >= 0.4.0 (https://github.com/tensorflow/io)
Encoding: UTF-8
LazyData: true
Depends: R (≥ 3.1)
Imports: reticulate (≥ 1.10), tensorflow (≥ 1.9), tfdatasets (≥ 1.9), forge, magrittr
RoxygenNote: 7.0.2
Suggests: testthat, knitr
VignetteBuilder: knitr
NeedsCompilation: no
Packaged: 2019-12-19 14:00:25 UTC; yuan.tang
Author: TensorFlow IO Contributors [aut, cph] (full list of contributors can be found at <https://github.com/tensorflow/io/graphs/contributors>), Yuan Tang
Maintainer: Yuan Tang <terrytangyuan@gmail.com>
Repository: CRAN
Date/Publication: 2019-12-19 16:50:02 UTC
Pipe operator
Description
See %>% for more details.
Usage
lhs %>% rhs
Creates an ArrowFeatherDataset.
Description
An Arrow Dataset for reading record batches from Arrow Feather files. Feather is a lightweight columnar format ideal for simple writing of Pandas DataFrames.
Usage
arrow_feather_dataset(filenames, columns, output_types, output_shapes = NULL)
Arguments
filenames: A tf.string tensor, list, or scalar containing files in Arrow Feather format.
columns: A list of column indices to be used in the Dataset.
output_types: Tensor dtypes of the output tensors.
output_shapes: TensorShapes of the output tensors, or NULL to infer partial shapes.
Examples
## Not run:
dataset <- arrow_feather_dataset(
  list('/path/to/a.feather', '/path/to/b.feather'),
  columns = reticulate::tuple(0L, 1L),
  output_types = reticulate::tuple(tf$int32, tf$float32),
  output_shapes = reticulate::tuple(list(), list())) %>%
  dataset_repeat(1)
sess <- tf$Session()
iterator <- make_iterator_one_shot(dataset)
next_batch <- iterator_get_next(iterator)
until_out_of_range({
  batch <- sess$run(next_batch)
  print(batch)
})
## End(Not run)
Creates an ArrowStreamDataset.
Description
An Arrow Dataset for reading record batches from an input stream. Currently supported input streams are a socket client or stdin.
Usage
arrow_stream_dataset(host, columns, output_types, output_shapes = NULL)
Arguments
host: A tf.string tensor or string defining the input stream. For a socket client, use "<HOST_IP>:<PORT>"; for stdin, use "STDIN".
columns: A list of column indices to be used in the Dataset.
output_types: Tensor dtypes of the output tensors.
output_shapes: TensorShapes of the output tensors, or NULL to infer partial shapes.
Examples
## Not run:
dataset <- arrow_stream_dataset(
  host,
  columns = reticulate::tuple(0L, 1L),
  output_types = reticulate::tuple(tf$int32, tf$float32),
  output_shapes = reticulate::tuple(list(), list())) %>%
  dataset_repeat(1)
sess <- tf$Session()
iterator <- make_iterator_one_shot(dataset)
next_batch <- iterator_get_next(iterator)
until_out_of_range({
  batch <- sess$run(next_batch)
  print(batch)
})
## End(Not run)
Create an Arrow Dataset from the given Arrow schema.
Description
Infer output types and shapes from the given Arrow schema and create an Arrow Dataset.
Usage
from_schema(object, ...)
Arguments
object: An R object.
...: Optional arguments passed on to implementing methods.
Create an Arrow Dataset for reading record batches from Arrow feather files, inferring output types and shapes from the given Arrow schema.
Description
Create an Arrow Dataset for reading record batches from Arrow feather files, inferring output types and shapes from the given Arrow schema.
Usage
## S3 method for class 'arrow_feather_dataset'
from_schema(object, schema, columns = NULL, host = NULL, filenames = NULL, ...)
Arguments
object: An R object.
schema: Arrow schema defining the record batch data in the stream.
columns: A list of column indices to be used in the Dataset.
host: Not used.
filenames: A tf.string tensor, list, or scalar containing files in Arrow Feather format.
...: Optional arguments passed on to implementing methods.
Create an Arrow Dataset from an input stream, inferring output types and shapes from the given Arrow schema.
Description
Create an Arrow Dataset from an input stream, inferring output types and shapes from the given Arrow schema.
Usage
## S3 method for class 'arrow_stream_dataset'
from_schema(object, schema, columns = NULL, host = NULL, filenames = NULL, ...)
Arguments
object: An R object.
schema: Arrow schema defining the record batch data in the stream.
columns: A list of column indices to be used in the Dataset.
host: A tf.string tensor or string defining the input stream.
filenames: Not used.
...: Optional arguments passed on to implementing methods.
Create an IgniteDataset.
Description
Apache Ignite is a memory-centric distributed database, caching, and processing platform for transactional, analytical, and streaming workloads, delivering in-memory speeds at petabyte scale. This contrib package contains an integration between Apache Ignite and TensorFlow. The integration is based on tf.data on the TensorFlow side and the Binary Client Protocol on the Apache Ignite side. It allows using Apache Ignite as a data source for neural network training, inference, and any other computation supported by TensorFlow. The Ignite Dataset is based on the Apache Ignite Binary Client Protocol.
Usage
ignite_dataset(
cache_name,
host = "localhost",
port = 10800,
local = FALSE,
part = -1,
page_size = 100,
username = NULL,
password = NULL,
certfile = NULL,
keyfile = NULL,
cert_password = NULL
)
Arguments
cache_name: Cache name to be used as the data source.
host: Apache Ignite Thin Client host to connect to.
port: Apache Ignite Thin Client port to connect to.
local: Local flag that restricts the query to local data only.
part: Number of partitions to be queried.
page_size: Apache Ignite Thin Client page size.
username: Apache Ignite Thin Client authentication username.
password: Apache Ignite Thin Client authentication password.
certfile: File in PEM format containing the certificate as well as any number of CA certificates needed to establish the certificate's authenticity.
keyfile: File containing the private key (otherwise the private key will be taken from certfile as well).
cert_password: Password to be used if the private key is encrypted and a password is necessary.
Examples
## Not run:
dataset <- ignite_dataset(
  cache_name = "SQL_PUBLIC_TEST_CACHE", port = 10800) %>%
  dataset_repeat(1)
sess <- tf$Session()
iterator <- make_iterator_one_shot(dataset)
next_batch <- iterator_get_next(iterator)
until_out_of_range({
  batch <- sess$run(next_batch)
  print(batch)
})
## End(Not run)
Creates a KafkaDataset.
Description
Creates a KafkaDataset.
Usage
kafka_dataset(
topics,
servers = "localhost",
group = "",
eof = FALSE,
timeout = 1000
)
Arguments
topics: A tf.string tensor containing one or more subscriptions, in the format of [topic:partition:offset:length]; by default length is -1 for unlimited.
servers: A list of bootstrap servers.
group: The consumer group id.
eof: If TRUE, the Kafka reader will stop on EOF.
timeout: The timeout value for the Kafka Consumer to wait (in milliseconds).
Examples
## Not run:
dataset <- kafka_dataset(
  topics = list("test:0:0:4"), group = "test", eof = TRUE) %>%
  dataset_repeat(1)
sess <- tf$Session()
iterator <- make_iterator_one_shot(dataset)
next_batch <- iterator_get_next(iterator)
until_out_of_range({
  batch <- sess$run(next_batch)
  print(batch)
})
## End(Not run)
Creates a KinesisDataset.
Description
Kinesis is a managed service provided by AWS for data streaming. This dataset reads messages from Kinesis with each message presented as a tf.string.
Usage
kinesis_dataset(stream, shard = "", read_indefinitely = TRUE, interval = 1e+05)
Arguments
stream: A tf.string tensor containing the name of the stream.
shard: A tf.string tensor containing the id of the shard.
read_indefinitely: If TRUE, the Kinesis dataset will keep retrying on EOF after the interval period; if FALSE, the dataset will stop on EOF.
interval: The interval for the Kinesis Client to wait before it tries to get records again (in milliseconds).
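Examples
A minimal sketch of consuming a KinesisDataset with a one-shot iterator, following the pattern used by the other datasets in this package; the stream name is a hypothetical placeholder and assumes AWS credentials are already configured.
## Not run:
# "my-kinesis-stream" is a placeholder; replace with an existing Kinesis stream.
dataset <- kinesis_dataset(
  stream = "my-kinesis-stream", read_indefinitely = FALSE) %>%
  dataset_repeat(1)
sess <- tf$Session()
iterator <- make_iterator_one_shot(dataset)
next_batch <- iterator_get_next(iterator)
until_out_of_range({
  batch <- sess$run(next_batch)
  print(batch)
})
## End(Not run)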
Create an LMDBDataset.
Description
This function allows a user to read data from an LMDB file. An LMDB file consists of (key, value) pairs stored sequentially.
Usage
lmdb_dataset(filenames)
Arguments
filenames: A tf.string tensor containing one or more filenames.
Examples
## Not run:
dataset <- lmdb_dataset("testdata/data.mdb") %>%
  dataset_repeat(1)
sess <- tf$Session()
iterator <- make_iterator_one_shot(dataset)
next_batch <- iterator_get_next(iterator)
until_out_of_range({
  batch <- sess$run(next_batch)
  print(batch)
})
## End(Not run)
Create a Dataset from LibSVM files.
Description
Create a Dataset from LibSVM files.
Usage
make_libsvm_dataset(
file_names,
num_features,
dtype = NULL,
label_dtype = NULL,
batch_size = 1,
compression_type = "",
buffer_size = NULL,
num_parallel_parser_calls = NULL,
drop_final_batch = FALSE,
prefetch_buffer_size = 0
)
Arguments
file_names: A tf.string tensor containing one or more filenames.
num_features: The number of features.
dtype: The type of the output feature tensor. Defaults to tf$float32.
label_dtype: The type of the output label tensor. Defaults to tf$int64.
batch_size: An integer representing the number of records to combine in a single batch; defaults to 1.
compression_type: A string evaluating to one of "" (no compression), "ZLIB", or "GZIP".
buffer_size: An integer denoting the number of bytes to buffer. A value of 0 results in default buffering values chosen based on the compression type.
num_parallel_parser_calls: Number of records to parse in parallel. Defaults to an automatic selection.
drop_final_batch: Whether the last batch should be dropped in case its size is smaller than batch_size; the default behavior is not to drop the smaller batch.
prefetch_buffer_size: An integer specifying the number of feature batches to prefetch for performance improvement. Defaults to auto-tune. Set to 0 to disable prefetching.
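Examples
A minimal sketch following the pattern used elsewhere in this package; the file path and feature count are hypothetical placeholders for a LibSVM-formatted file. Each batch pairs a feature tensor with a label tensor.
## Not run:
# "testdata/sample.libsvm" is a placeholder path; num_features must match the file.
dataset <- make_libsvm_dataset(
  file_names = list("testdata/sample.libsvm"),
  num_features = 10,
  batch_size = 2)
sess <- tf$Session()
iterator <- make_iterator_one_shot(dataset)
next_batch <- iterator_get_next(iterator)
until_out_of_range({
  batch <- sess$run(next_batch)
  print(batch)
})
## End(Not run)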
Creates a MNISTImageDataset.
Description
This creates a dataset for MNIST images.
Usage
mnist_image_dataset(filenames, compression_type = NULL)
Arguments
filenames: A tf.string tensor containing one or more filenames.
compression_type: A string evaluating to one of "" (no compression), "ZLIB", or "GZIP".
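Examples
A minimal sketch following the other examples in this package; the filename refers to a hypothetical gzip-compressed MNIST image file in the standard IDX format.
## Not run:
# "testdata/t10k-images-idx3-ubyte.gz" is a placeholder for an MNIST image file.
dataset <- mnist_image_dataset(
  filenames = list("testdata/t10k-images-idx3-ubyte.gz"),
  compression_type = "GZIP") %>%
  dataset_repeat(1)
sess <- tf$Session()
iterator <- make_iterator_one_shot(dataset)
next_batch <- iterator_get_next(iterator)
until_out_of_range({
  batch <- sess$run(next_batch)
  print(batch)
})
## End(Not run)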
Creates a MNISTLabelDataset.
Description
This creates a dataset for MNIST labels.
Usage
mnist_label_dataset(filenames, compression_type = NULL)
Arguments
filenames: A tf.string tensor containing one or more filenames.
compression_type: A string evaluating to one of "" (no compression), "ZLIB", or "GZIP".
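Examples
A minimal sketch mirroring the image example above; the filename refers to a hypothetical gzip-compressed MNIST label file in the standard IDX format.
## Not run:
# "testdata/t10k-labels-idx1-ubyte.gz" is a placeholder for an MNIST label file.
dataset <- mnist_label_dataset(
  filenames = list("testdata/t10k-labels-idx1-ubyte.gz"),
  compression_type = "GZIP") %>%
  dataset_repeat(1)
sess <- tf$Session()
iterator <- make_iterator_one_shot(dataset)
next_batch <- iterator_get_next(iterator)
until_out_of_range({
  batch <- sess$run(next_batch)
  print(batch)
})
## End(Not run)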
Create a ParquetDataset.
Description
This allows a user to read data from a Parquet file.
Usage
parquet_dataset(filenames, columns, output_types)
Arguments
filenames: A 0-D or 1-D tf.string tensor containing one or more filenames.
columns: A 0-D or 1-D tf.int32 tensor containing the columns to extract.
output_types: A tuple of tf.DType objects representing the types of the columns returned.
Examples
## Not run:
dtypes <- tf$python$framework$dtypes
output_types <- reticulate::tuple(
  dtypes$bool, dtypes$int32, dtypes$int64, dtypes$float32, dtypes$float64)
dataset <- parquet_dataset(
  filenames = list("testdata/parquet_cpp_example.parquet"),
  columns = list(0, 1, 2, 4, 5),
  output_types = output_types) %>%
  dataset_repeat(2)
sess <- tf$Session()
iterator <- make_iterator_one_shot(dataset)
next_batch <- iterator_get_next(iterator)
until_out_of_range({
  batch <- sess$run(next_batch)
  print(batch)
})
## End(Not run)
Creates a PubSubDataset.
Description
This creates a dataset for consuming PubSub messages.
Usage
pubsub_dataset(subscriptions, server = NULL, eof = FALSE, timeout = 1000)
Arguments
subscriptions: A tf.string tensor containing one or more subscriptions.
server: The pubsub server.
eof: If TRUE, the pubsub reader will stop on EOF.
timeout: The timeout value for the PubSub to wait (in milliseconds).
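Examples
A minimal sketch following the other examples in this package; the subscription path and the local emulator address passed as server are hypothetical placeholders.
## Not run:
# Placeholder subscription and a hypothetical local Pub/Sub emulator address.
dataset <- pubsub_dataset(
  subscriptions = list("projects/my-project/subscriptions/my-subscription"),
  server = "localhost:8085",
  eof = TRUE) %>%
  dataset_repeat(1)
sess <- tf$Session()
iterator <- make_iterator_one_shot(dataset)
next_batch <- iterator_get_next(iterator)
until_out_of_range({
  batch <- sess$run(next_batch)
  print(batch)
})
## End(Not run)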
Objects exported from other packages
Description
These objects are imported from other packages. Follow the links below to see their documentation.
- tensorflow
- tfdatasets: dataset_batch, dataset_cache, dataset_concatenate, dataset_filter, dataset_flat_map, dataset_interleave, dataset_map, dataset_map_and_batch, dataset_padded_batch, dataset_prefetch, dataset_prefetch_to_device, dataset_prepare, dataset_repeat, dataset_shard, dataset_shuffle, dataset_shuffle_and_repeat, dataset_skip, dataset_take, iterator_get_next, iterator_initializer, iterator_make_initializer, iterator_string_handle, make_iterator_from_string_handle, make_iterator_from_structure, make_iterator_initializable, make_iterator_one_shot, next_batch, out_of_range_handler, output_types, until_out_of_range, with_dataset
Create a SequenceFileDataset.
Description
This function allows a user to read data from a Hadoop SequenceFile. A SequenceFile consists of (key, value) pairs stored sequentially. At the moment, org.apache.hadoop.io.Text is the only supported serialization type, and there is no compression support.
Usage
sequence_file_dataset(filenames)
Arguments
filenames: A tf.string tensor containing one or more filenames.
Examples
## Not run:
dataset <- sequence_file_dataset("testdata/string.seq") %>%
  dataset_repeat(1)
sess <- tf$Session()
iterator <- make_iterator_one_shot(dataset)
next_batch <- iterator_get_next(iterator)
until_out_of_range({
  batch <- sess$run(next_batch)
  print(batch)
})
## End(Not run)
TensorFlow IO API for R
Description
This library provides an R interface to the TensorFlow IO API that provides datasets and filesystem extensions maintained by SIG-IO.
Create a TIFFDataset.
Description
A TIFF Image File Dataset that reads the TIFF file.
Usage
tiff_dataset(filenames)
Arguments
filenames: A tf.string tensor containing one or more filenames.
Examples
## Not run:
dataset <- tiff_dataset(
  filenames = list("testdata/small.tiff")) %>%
  dataset_repeat(1)
sess <- tf$Session()
iterator <- make_iterator_one_shot(dataset)
next_batch <- iterator_get_next(iterator)
until_out_of_range({
  batch <- sess$run(next_batch)
  print(batch)
})
## End(Not run)
Create a VideoDataset that reads the video file.
Description
This allows a user to read data from a video file with FFmpeg. The output of VideoDataset is a sequence of (height, width, 3) tensors in rgb24 format.
Usage
video_dataset(filenames)
Arguments
filenames: A tf.string tensor containing one or more filenames.
Examples
## Not run:
dataset <- video_dataset(
  filenames = list("testdata/small.mp4")) %>%
  dataset_repeat(2)
sess <- tf$Session()
iterator <- make_iterator_one_shot(dataset)
next_batch <- iterator_get_next(iterator)
until_out_of_range({
  batch <- sess$run(next_batch)
  print(batch)
})
## End(Not run)
Create a WebPDataset.
Description
A WebP Image File Dataset that reads the WebP file.
Usage
webp_dataset(filenames)
Arguments
filenames: A tf.string tensor containing one or more filenames.
Examples
## Not run:
dataset <- webp_dataset(
  filenames = list("testdata/sample.webp")) %>%
  dataset_repeat(1)
sess <- tf$Session()
iterator <- make_iterator_one_shot(dataset)
next_batch <- iterator_get_next(iterator)
until_out_of_range({
  batch <- sess$run(next_batch)
  print(batch)
})
## End(Not run)