I have been asked to review the book “R Object-oriented Programming” by Kelly Black, published by Packt Publishing (£14.45 for the E-Book, £27.99 for Print+E-Book).
The scope of the book
is “to provide a resource for programming using the R language” and therefore
it can be seen as a good and practical introduction to all the most commonly
used parts of R. The first two chapters deal with data types and data organization
in R. They quickly review how to handle each type of data (such as integers and
doubles) and how to organize them into R objects. The third chapter deals with reading data from files and saving them.
This chapter gives a pretty good introduction to reading and writing every
sort of data, even binary files, from a variety of sources, including the
web. Chapter 4 provides an introduction to the R commands for generating random
numbers; in particular, it gives a thorough overview of the sample command. Chapters 5 and 6 give a good background on the
use of R to manipulate string and time variables. Of particular interest throughout
the book is the handling of data gathered from public sources on the web. For
these particular data, skills in string manipulation become crucial both for
handling web addresses and for extracting the actual data from the information
returned by the server. For this reason I think this book does a good job of introducing
these important aspects of the R language.
Chapter 7 introduces some basic programming concepts, such
as if statements and loops. Chapters 8 and 9 provide a complete overview of the
S3 and S4 classes, and finally chapters 10 and 11 are two hands-on examples
of how to put together all the concepts learned in the book to solve very
practical problems. In these examples the reader is guided towards the
creation of powerful R programs to grade students and perform a Monte Carlo
simulation.
The book is written in a very practical form, meaning that
not much time is spent explaining each function in detail; readers can browse
the help pages of each function for more details. This means that this
book is probably not for newcomers to programming. Most of the learning is done
by exploring the lines of code provided, and for this reason I think the best
readers would be people familiar with a programming language, even though I do
not think that readers necessarily need any familiarity with R. However, as
stated on the website, the target audience for this book is beginners who want to
become more “fluent” with the language.
Overall, I think this book does a good job of providing the
reader with a strong and neat introduction to all the bits of coding required
to become more comfortable writing advanced scripts. For example, at the end of chapter
2 the author discusses the use of the apply
set of commands. These are a crucial milestone for every individual
who wants to switch from a mundane use of R to a more advanced and rigorous use
of the language. In my personal experience, when I began using R I would often
create very long scripts using lots of loops and if statements, which tend to
greatly decrease the execution speed. As soon as I learned to master the apply set of commands I was able to shorten
my code and, crucially, I was also able to substantially increase its execution
speed. Personally I would have loved to have access to such a book back then! The
use of web sources for data manipulation is also a very nice addition that, as
far as I know, is not common in other introductory texts. Nowadays gathering
data from the web has become the norm and therefore I think it is important to
provide beginners with tools to handle this type of data.
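
To make the point about the apply set of commands a little more concrete, here is a minimal sketch of my own (it is not taken from the book, and the scores matrix and the variable names are invented for illustration). It computes the mean of each row of a matrix first with an explicit loop and then with apply. Whether the apply version is actually faster depends on the task and is not measured here; the sketch only shows how much shorter the code becomes.

```r
# Hypothetical data: a 3 x 4 matrix of scores (not from the book).
scores <- matrix(1:12, nrow = 3)

# Loop version: pre-allocate the result, then fill it row by row.
row_means_loop <- numeric(nrow(scores))
for (i in seq_len(nrow(scores))) {
  row_means_loop[i] <- mean(scores[i, ])
}

# apply version: the same computation in a single call.
# MARGIN = 1 applies the function over rows (2 would mean columns).
row_means_apply <- apply(scores, 1, mean)

# sapply works on lists and vectors and simplifies the result to a vector.
# Again, the list below is made up for the example.
words <- list(a = c("x", "y"), b = "z")
lengths_by_name <- sapply(words, length)
```

Both row_means_loop and row_means_apply hold the same three values; the difference is that the second form states what is being computed rather than how to iterate.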
The strength of this book, however, is in chapters 8 and 9,
which provide an extensive introduction to the use of S3 and S4 classes. I think these two chapters alone would justify the price of the book. As far as I know these concepts are generally not treated
with the right attention in books for beginners. Such books may explain that when
you load a package, the functions you normally use, such as plot, may change their behaviour and
options. However, I have never found an introductory book that provides such an exhaustive
explanation of how to fully control these classes to create advanced programs. Of
particular interest are also the two examples provided in Chapters 10 and 11.
These are practical exercises that put together all the concepts learned in the
previous chapters with the purpose of creating R programs that can be easily
implemented and shared. Chapter 10, for example, describes a neat and powerful way to
create a new R program to grade students. In this chapter the reader uses
all the basic programming concepts learned during the course of the book and
puts them together to create an R program that imports grades from
csv files, manipulates them and creates summary statistics and plots.
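
For readers who have never seen S3 and S4 classes, the following is a rough sketch of the idea, written by me and not copied from the book. The class names grades and Grades, the constructor new_grades and the generic averageMark are all hypothetical. It shows why a function such as summary or plot can behave differently depending on the object it receives: R dispatches the call to a method written for that object's class.

```r
# S3: a class is just an attribute attached to an ordinary object.
# new_grades() is a hypothetical constructor for this illustration.
new_grades <- function(student, marks) {
  structure(list(student = student, marks = marks), class = "grades")
}

# Naming a function <generic>.<class> makes it the S3 method that
# summary() dispatches to for objects of class "grades".
summary.grades <- function(object, ...) {
  c(mean = mean(object$marks), max = max(object$marks))
}

# S4: a formal class with typed slots, plus an explicit generic and method.
setClass("Grades", representation(student = "character", marks = "numeric"))
setGeneric("averageMark", function(x) standardGeneric("averageMark"))
setMethod("averageMark", "Grades", function(x) mean(x@marks))

# The same (made-up) data handled by both systems.
g3 <- new_grades("Ann", c(70, 80, 90))
s3_result <- summary(g3)       # calls summary.grades()

g4 <- new("Grades", student = "Ann", marks = c(70, 80, 90))
s4_result <- averageMark(g4)   # calls the S4 method for "Grades"
```

S3 is the lighter of the two and relies on naming conventions, while S4 checks slot types when the object is created; the book covers both in far more depth than this sketch.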
In conclusion, I see a variety of uses for this book.
Clearly it is targeted at post-beginners who need a quick way to unlock the
full power of R for their daily statistical routines. However, this book does
not lose its purpose once we have learned to use the language properly. It is
written in such a way that even for experienced R users it is a useful way to
quickly look up functions and methods that they may not use very often. I
sometimes forget how to use certain functions, and having such a book on my
office bookshelf will certainly help me in these frustrating situations. So I
think it will become part of the set of references that future R users will use
on a regular basis.