How To Organize Product-Related Assets And Collaborate Better

So, you start working on a new product. It’s crucial to set a well-organized environment — that is, the space where you and your team interact with all product-related files and documents — right away. The amount of your assets will only grow with time, and it becomes almost unbearable to find and fix the…

Introduction to R and RStudio

Introduction to R and RStudio – SitePointSkip to main contentFree JavaScript Book!Write powerful, clean and maintainable JavaScript.RRP $11.95 With increased computing power comes increased access to large amounts of freely accessible data. People are tracking their lives with productivity, calorie, fitness and sleep trackers. Governments are publishing survey data left and right, and companies conduct audience testing that needs analyzing. There’s a lot of data out there even now, ready to be grabbed and looked at.
In this tutorial, we’ll look at the basics of the R programming language — a language built solely for statistical computing. I won’t bore you with Wikipedia definitions. Instead, let’s dive right into it. In this introduction, we’ll cover the installation of the default IDE and language, and its data types.
Installing
R is both a programming language and a software environment, which means it’s fully self-contained. There are two steps to getting it installed:
Both are free, both open source. R will be installed as the underlying engine that powers RStudio’s computations, while RStudio will provide sample data, command autocompletion, help files, and an effective interface for getting things done quickly. You could write R code in simple text files as in most other languages, but that’s really not recommended given how many commands there are and how complex things can quickly get.
After you’ve installed the tools, launch R Studio.

IDE Areas
Let’s briefly explain the GUI. There are four main parts. I’ll explain the default order, though note that this can be changed in Settings/Preferences > Pane Layout.
The Editor

The top left quadrant is the editor. It’s where you write R code you want to keep for later — functions, classes, packages, etc. This is, for all intents and purposes, identical to every other code editor’s main window. Apart from some self-explanatory buttons, and others that needn’t concern you at this starting point, there’s also a “Source on Save” checkbox. This means “Load the contents of the file into my console’s runtime every time I save the file”. You should have this on at all times, as it makes your development flow faster by one click.
The Console

The lower left quadrant is the console. It’s a REPL for R in which you can test out your ideas, datasets, filters, and functions. This is where you’ll be spending most of your time in the beginning. Here’s where you verify that an idea you had works before copying it over into the editor above. This is also the environment into which your R files will be sourced on save (see above), so whenever you develop a new function in an R file above, it automatically becomes available in this REPL. We’ll be spending a lot of time in the REPL in the remainder of this tutorial.
History / Environment

The top right quadrant has two tabs: Environment and History.
Environment refers to the console environment (see above) and will list, in detail, every single symbol you defined in the console (whether via sourcing or directly). That is, if you have a function available in the REPL, it will be listed in the environment. If you have a variable, or a dataset, it will be listed there. This is where you can also import custom datasets manually and make them instantly available in the console, if you don’t feel like typing out the commands to do so. You can also inspect the environment of other packages you installed and loaded. (More on packages at a later time.) Go ahead and play around with it; you can’t break anything.
History lists every single console command you executed since the last project started. It’s saved into a hidden .Rhistory file in your project’s folder. If you don’t choose to save your environment after a session, the history won’t be saved.
Misc

The bottom right panel is the misc panel, and contains five separate tabs. The first one, Files, is self-explanatory.
The Plots tab will contain the graphs you generated with R. It’s here that you can zoom, export, configure and inspect your charts and plots.
The Packages tab lets you install additional packages into R. A brief description is next to each available package, though there are many more than those listed there. We’ll go through package repositories in a later post.
The Help tab lets you search the incredibly extensive help directory and will automatically open whenever you call help on a command in the console. (Help is called by prepending a command name with a question mark, like so: ?data.frame.)
Finally, the Viewer is essentially RStudio’s built-in browser. Yes, you can develop web apps with R and even launch locally hosted web apps within it.
Built-in datasets
In the text below, whenever I mention using a command, assume this means punching it into the console. So, if I say “We look at the help for DataFrames with ?data.frame”, you do this:

RStudio comes with some datasets for new users to play around with. To use a built-in dataset, we load it with the data function, and supply an argument corresponding to the set we want. To see all the available built-in sets, punch in data(), without an argument.

Looking at the list of available datasets, let’s load a very small one for starters:
data(‘women’)

You should see the women variable appear in the Environment panel, though its second field says . A promise in this case merely means “The data will be there when you actually need it”. We told R to load this set, but we haven’t actually used it anywhere, so it didn’t feel the need to load it fully into memory. Let’s tell R we need it. In the console, print out the entire set by simply calling this:
women

This is equivalent to:
print(women)

Note: we’ll be using the former approach, simply because it’s less typing. Remember: in R, the last value that’s typed out without being an expression (like assigning or summing something) is what gets auto-printed to the console.
The numbers will be produced in the console, and the Environment entry for women should change. You should be able to see the data in the environment panel now, too, by clicking the blue expand arrow next to the variable name.

This set only has 15 entries, and as such offers nothing of value, but it’s good enough for playing around in.
To further study the set you’re dealing with, there are several functions to keep in mind (a demonstration of each can be seen below explanations):
nrow/ncol will list the number of rows/columns respectively.
summary will output a summary about the set’s columns. In the case of the women set, we have two numeric columns (both columns are numeric, or in other words, each column is a numeric vector; more on data types and vectors later). And R knows that, when you ask it for an analysis of a numeric vector, it should give you the typical values for such collections: the minimum value in the set, the mean (average) between the minimum and the mean, the mean (average of all values), the mean between the mean and the maximum, and the maximum, the largest number in the column. It does this for both height and width. For different types of vectors (like ones where every element is a word instead of a number) the output is different.
str is a different kind of summary. In fact, str stands for “structure” and it outputs a summary of a dataset’s structure. In our case, it will tell us that it’s a “data.frame” (a special data type we’ll explain later) with 15 obs (observations or rows) and two variables (or columns). It then proceeds to list all the columns in the DataFrame with some (but not all) of their values, just so we get a grasp on the kind of values we’re dealing with.
dim gives you the dimensions of a dataset. Calling dim(women) gives us 15 2, which means 15 rows and two columns. length can be used to count the number of vertical elements in a set. In vectors (see below), this is the number of elements; in data sets like women, this is the number of columns:
> nrow(women)
[1] 15
> ncol(women)
[1] 2
> summary(women)
height weight
Min. :58.0 Min. :115.0
1st Qu.:61.5 1st Qu.:124.5
Median :65.0 Median :135.0
Mean :65.0 Mean :136.7
3rd Qu.:68.5 3rd Qu.:148.0
Max. :72.0 Max. :164.0
> str(women)
‘data.frame’: 15 obs. of 2 variables:
$ height: num 58 59 60 61 62 63 64 65 66 67 …
$ weight: num 115 117 120 123 126 129 132 135 139 142 …
> dim(women)
[1] 15 2

You’ll be using these functions a lot, so I recommend you get familiar with them. Load some of the other datasets and inspect them like this. There’s no need to know them by heart. This tutorial and the help files will always be around for reference, but it’s nice to be fluent in them anyway.
Data Types
R has some typical atomic data types you already know about from other languages, but it also provides some more statistics-inclined ones. Let’s briefly go through them. While explaining these types, I’ll talk about assigning them. Assigning in R is done with the “left arrow” operator or myNum class(myNum)
[1] ‘numeric’

You can coerce (change type of) numeric string values into numeric types, like so:
> myString class(myString)
[1] ‘character’
> myNumber myNumber
[1] 5.6
> class(myNumber)
[1] ‘numeric’

There’s also a special number Inf which represents infinity. It can be used in calculations:
> 1/0
[1] Inf

Another “number” is NaN, which stands for “Not a Number”. This is what you get when you do something like 0/0.
Integer
Integers are whole numbers, though they get autocoerced (changed) into numerics when saved into variables:
> myInt class(myInt)
[1] ‘numeric’

To actually force them to be integers, we need to invoke a function that manually coerces them, called as.integer:
> myInt class(myInt)
[1] ‘integer’

You can prevent autocoercion by setting integers with an L suffix:
> myInt = 5L
> class(myInt)
[1] ‘integer’

Note that, if you give R a number that’s greater than what its memory can store, it autocoerces it into a real number, even if you put L at the end:
> myInt class(myInt)
[1] ‘numeric’
> myInt
[1] 2.479827e+27

But if you then try to coerce that number into an integer, R will discard it because it simply can’t make integers that big. Instead of a number, you get “NA”, which is a special type in R indicating “Not Available”, also known as a missing value:
> myIntCoerced myIntCoerced
[1] NA
> class(myIntCoerced)
[1] ‘integer’

The NA is still a type of “integer”, but one without value.
Note that, when coercing numerics into integers, decimal places get lost. The same applies to coercing from numeric decimal strings:
> myString myNumeric myInteger1 myInteger2 myInteger1 == myInteger2
[1] TRUE
> myInteger1
[1] 5

Complex
Explaining complex numbers is a bit outside the scope of this tutorial, particularly if you weren’t exposed to them in school, but if you’re curious, you can find out more about complex numbers on Wikipedia. They take the form of a + bi where a and b are real numbers and i is imaginary. In R, they’re constructed with a special complex function:
> myComplex myComplex
[1] 3292+8974892i
> class(myComplex)
[1] ‘complex’

You won’t be needing those nearly as often as you might need the other types, but if you want to know more about the complex function, just call for help on it: ?complex.
Logical
Logical types (booleans) are the same as in most other languages and can be two things: either true, or false. True can be represented with TRUE or T, while false is, predictably, FALSE or F:
> TRUE == T
[1] TRUE
> myBool myBool == T
[1] TRUE
> myComparison 6
> myComparison == FALSE
[1] TRUE
> class(myComparison)
[1] ‘logical’

Whenever you create an expression, the result of which is a “yes” or “no” value, you get a TRUE or FALSE — like in the case of 5 > 6 above. 5 is not greater than 6, so the expression becomes FALSE. Comparing myComparison to FALSE thus yields TRUE because the myComparison variable indeed contains a value of FALSE.
When needed by a function, logical values will be coerced into numerics. This means that, if I write 1 + TRUE, the console will produce 2, whereas 1 + FALSE gives 1. Likewise, we can easily coerce other types into logicals (as.logical(myVariable)). Any numeric or integer with a value not equal to 0 or NA will give TRUE. 0 and 0L will give FALSE. Strings like “True”, “TRUE”, “true” and “T” will be turned into TRUE, “False”, “FALSE”, “false” and “F” will be FALSE. Any other string will coerce into a logical NA value.
Higher Types
Higher types are types composed of the lower ones.
Vectors and Lists
The most essential of all, the vector, is a collection of elements of the same type. In our earlier dataset example, one column of the women dataset was a numeric vector, meaning it was a collection with only numeric values in it. A vector can only have elements of the exact same type. Vectors can be created with the vector function, but are usually created with the shorthand c (concatenate) function:
> myVector class(myVector)
[1] ‘character’
> myVector
[1] ‘Hello’ ‘World’ ‘Third Element’

We can see here that the vector’s class is “character”, meaning that it contains only character type values. If we print it out by just calling its variable name, we get all three elements and a [1]. The [1] means literally: “I am outputting the contents of your vector. The first element on this line is element number 1 in the vector”.
What you see in [] depends entirely on the size of your console panel and the length of the array. For example:
> myVector myVector
[1] ‘One’ ‘Two’ ‘Three’ ‘Four’ ‘Five’ ‘Six’ ‘Seven’
[8] ‘Eight’ ‘Nine’ ‘Ten’ ‘Eleven’ ‘Twelve’ ‘Thirteen’ ‘Fourteen’
[15] ‘Fifteen’

The number in square brackets simply means “The element after me is Nth in the set”. It’s there purely to make the output more readable, and doesn’t affect actual data.
Note that vectors are strictly one-dimensional. You can’t add another vector as an element inside an existing vector; their elements get merged into one:
> v1 v2 v3 v3
[1] ‘a’ ‘b’ ‘c’ ‘d’ ‘e’ ‘f’

You can generate entire numeric vectors by specifying a range:
> myRange myRange
[1] 1 2 3 4 5 6 7 8 9 10

Lists are just like vectors, only they don’t have the limitation of being able to hold elements of the same type exclusively. They’re built with the list function or with the c function if one of the elements you’re adding is a list:
> myList class(myList)
[1] ‘list’
> myList
[[1]]
[1] 5

[[2]]
[1] ‘Hello’

[[3]]
[1] ‘Worlds’

[[4]]
[1] TRUE

As with vectors, the one-dimensionality rule applies. Adding a list into another will merge their elements.
The [[N]] output means “The first element of this list is a vector with a single element 5, the second element of this list is a vector with a single element Hello… etc.” Appending the [[N]] to the variable that holds the list actually returns the element. We can check this easily:
> class(myList[[1]])
[1] ‘numeric’
> class(myList[[2]])
[1] ‘character’
> class(myList[[3]])
[1] ‘character’
> class(myList[[4]])
[1] ‘logical’

Data.Frame
DataFrames are essentially tables with rows and columns, much like spreadsheets. The women dataset we loaded above was a DataFrame. You can access individual columns of DataFrames by using the $ operator on the variable, followed by the column name:
> women$height
[1] 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72

The result gets returned as a numeric vector. Check its class with class(women$height).
If the column name contains spaces, you can wrap it in quotation marks (women$”Female Height”), but you can also access a column by its numeric position in the list of columns. For example, we know height is the first column:
> head(women[1])
height
1 58
2 59
3 60
4 61
5 62
6 63
> head(women[[1]])
[1] 58 59 60 61 62 63

The head function tells R to only return the first six results. This is so we keep our console nice and scroll-free, which is excellent for brief looks into datasets without printing them out in their entirety. (The opposite can be achieved with tail, which prints out the last six results.)
We can see here that, when accessing the column with single-square-bracket [1], we get a DataFrame but with one less column. If, however, we access it with the double-square-bracket [[1]], we get a numeric vector of heights. The data returned by using the single bracket stays the same type as the parent data, while the double bracket accessor targets the specific values in that column and returns them in their most rudimentary form, a numeric vector.
We create DataFrames with the data.frame function:
> men head(men)
height weight
1 50 150
2 51 151
3 52 152
4 53 153
5 54 154
6 55 155

Here, we created a sample men dataset not unlike the women set from before.
If we want to get just the names of the columns, we use the names function. We can even assign a value to it and thus change the column names:
> names(men)
[1] ‘height’ ‘weight’
> names(men) head(men)
Male Height Male Weight
1 50 150
2 51 151
3 52 152
4 53 153
5 54 154
6 55 155

Matrix
Matrices are multi-dimensional vectors. They are like DataFrames, but can only contain values of the same type. They’re created with the matrix function and need the number of rows and columns as parameters, and the values to place into these slots:
> m m
[,1] [,2] [,3] [,4] [,5]
[1,] 1 5 9 13 17
[2,] 2 6 10 14 18
[3,] 3 7 11 15 19
[4,] 4 8 12 16 20

What happens if the number of values provided doesn’t match the number of cells?
> m m
[,1] [,2] [,3] [,4] [,5]
[1,] 1 5 9 13 17
[2,] 2 6 10 14 18
[3,] 3 7 11 15 19
[4,] 4 8 12 16 20

We get a warning, but the matrix is created anyway, with all the extra values simply discarded. If there are fewer values than cells, the values provided get recycled until the matrix is full:
> m m
[,1] [,2] [,3] [,4] [,5]
[1,] 1 5 9 13 1
[2,] 2 6 10 14 2
[3,] 3 7 11 15 3
[4,] 4 8 12 16 4

Much as DataFrames have the names attribute/function, matrices have a dim (dimension) one. Changing this property can mutate the form of a matrix:
> m m
[1] 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
> dim(m)
NULL
> dim(m) m
[,1] [,2] [,3] [,4] [,5]
[1,] 1 4 7 10 13
[2,] 2 5 8 11 14
[3,] 3 6 9 12 15
> dim(m)
[1] 3 5
> dim(m) m
[,1] [,2] [,3]
[1,] 1 6 11
[2,] 2 7 12
[3,] 3 8 13
[4,] 4 9 14
[5,] 5 10 15
> dim(m)
[1] 5 3
> dim(m) m
[1] 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15

Here, we’ve created a numeric vector of 15 sequential numbers (in such cases, we can omit the c). The dim attribute didn’t exist on it, as evident by the dim function returning NULL, so we changed it by assigning a numeric vector of 3, 5 to it. This resulted in a reshuffling of the elements to fit into the newly constructed matrix. We then changed the dimensions again by inverting the number of columns and rows to 5, 3, which again produced a different matrix. Finally, nullifying the dimensions produced the numeric vector from the beginning.
It’s also possible to combine/expand/alter vectors, DataFrames and matrices by using cbind and rbind:
> v x bound bound
v x
[1,] 1 6
[2,] 2 7
[3,] 3 8
[4,] 4 9
[5,] 5 10
> bound bound
[,1] [,2] [,3] [,4] [,5]
v 1 2 3 4 5
x 6 7 8 9 10

Factors
Factors are vectors with labels. This is different from character vectors, or even numeric ones, because they’re what R folks call “self describing”, allowing R’s functions to automatically make more sense of them than they would of the other types. They’re built with the factor function, which needs to be fed a vector as an argument:
> f f
[1] Hello World Hello Annie Hello World
Levels: Annie Hello World
> table(f)
f
Annie Hello World
1 3 2

Levels lists out the unique elements in the factor. The table function grabs a factor and builds a table consisting of the various levels in the factor, and the number of their occurrences in the factor.
I personally haven’t found a use for factors yet in my projects, but I’m learning about them.
Conclusion
In this tutorial, we’ve covered the basic data types in R and the essentials of using RStudio. You’re now armed with all the knowledge you need to start some basic data operations. Remember that all the functions we covered above are fully searchable in the help files with ?function, where “function” is the function name.
When learning a programming language, it’s customary to teach data types first, logical operators second, and control structures third, before moving into advanced things like functions and classes. But in this tutorial, I believe we’ve laid a decent enough foundation to jump straight into the fire and learn by example.
Related ArticlesUsing Python to Parse Spreadsheet DataProgrammingBy

What Google’s New Page Experience Update Means for Images on Your Website

It’s easy to forget that, as a search engine, Google doesn’t just compare keywords to generate search results. Google knows that if people don’t enjoy their experience on a web page, they won’t stay on the page long enough to consume the content — no matter how relevant it is. As a result, Google has…

How to Create a Simple Web-Based Chat Application

In this tutorial, we will be creating a simple web-based chat application with PHP and jQuery. This sort of utility would be perfect for a live support system for your website. This tutorial was updated recently to make improvements in the chat app.  Introduction The chat application we will be building today will be quite…

The New Rules of Minimal Design: Best Practices, Examples, and Tips

While there is nothing new about minimalism as a design trend, the application and techniques are ever-evolving. Some of the most notable changes with modern minimal designs are the use of space, color, and typography treatments. These design elements can make the overall aesthetic seem new and fresh or show a more dated style. Here,…

7 Reasons NOT to Use a Static Site Generator

7 Reasons NOT to Use a Static Site Generator – SitePointSkip to main contentFree JavaScript Book!Write powerful, clean and maintainable JavaScript.RRP $11.95 Static site generators (SSGs) are popular and offer many benefits, but this article discusses reasons why they may not be a suitable replacement for your content management system (CMS).
In my previous article, we discussed how your site could benefit from using a static site generator:

A static site is a collection of pages contained in basic HTML files. You could hand-write these in a text editor, but managing assets and repeated elements such as navigation can become problematic.
A content management system stores page content in a database and provides facilities to edit and apply themes. Management becomes easier at the expense of flexibility, performance, server requirements, security, and backups.
A static site generator is a compromise between using a hand-coded static site and a full CMS. You generate the full site once using raw data (such as Markdown files) and templates. The resulting set of files is transferred to your live server.
The term “Jamstack” (JavaScript, APIs, and Markup) is used in relation to static sites. It refers to the rise in frameworks, serverless functions, and associated tools which generate static files but allow complex interactivity to be added.

Popular static site generators include Jekyll, Eleventy, Gatsby, Hugo, and Metalsmith. SSGs are available for most languages; see StaticGen for many more.
SSGs appear to offer the benefits of both CMS and static worlds, but they may not be suitable for every project …
1. You’re On Your Own
You won’t get far using a static site generator without some development expertise. The process is more difficult than using a CMS, there are fewer resources, and you could struggle to find pre-built plugins and templates.
Contrast that with WordPress. A non-technical user may require installation assistance but, once that’s complete, they can edit a site and install one of the many thousands of themes and plugins available. They may not have the best bespoke website, but they’re running with minimal intervention.
2. Choice Paralysis
There are many static site generators, but even the most popular tools are used by a tiny proportion of the web community. You’ll need time to research, investigate, and evaluate the options. One of the first SSGs was the Ruby-based Jekyll but, while you don’t necessarily require Ruby expertise, it will help if you’ve used the language before.
There are many CMSs, but there’s one obvious choice: WordPress. It powers more than 40% of the Web, so help is abundant. Again, it will help if you have some PHP experience, but even a non-developer can create a reasonable website using off-the-shelf themes and plugins.
3. The Initial Setup Time
Creating your first static site will take time. You’ll need to learn the build process, and much of the template code will need to be developed. Deployment scripts may also be necessary.
Developing a custom CMS theme can also be complicated, but pre-built templates are available and assistance is easier to find. Further development may not be required following the initial installation.
4. No Administration Interface
Clients may be cautious when faced with a complex CMS interface. Asking them to create and edit a set of Markdown files may terrify many. You can make the process easier by perhaps:

using their existing CMS as an SSG data source, or
providing simpler workflows, such as editing Git-based files in StackEdit or Hackmd.io.

But this will further impact your initial development time.
5. Website Consistency
Static sites are flexible: anything contained within source content can be rendered on a web page. Users may be able to include scripts, widgets, or numerous undesired items.
A CMS can be configured to constrain the user. Content is normally bound to a database with specific fields, so administration panels prompt the user to enter a title, body content, excerpts, featured images, and so on. Even if the user enters something in an unexpected field, it won’t appear on the website unless it’s implemented within the theme template.
6. Managing Large Sites
Consider a website with thousands of pages, daily content publications, real-time breaking news, and dozens of authors in multiple locations. It’s possible to manage content using a static site generator, but:

Content editing and publishing can be more awkward. Editors may require access to the Git repository or shared folders rather than a simple web or app interface.
Real-time updates are delayed because the site must be rebuilt, tested, and deployed.
Build times can increase rapidly and deployment could become cumbersome.

Static site generators are perhaps best suited to sites containing no more than a few hundred pages with a couple of new posts every week. Automated build and deploy processes will be required, and you may reach a point where a CMS becomes a more practical option.
7. Server-side Functionality
Static sites are perfect for content pages, but the situation becomes more challenging when you require user logins, form filling, search facilities, discussion forums, or other server and database interactivity. Options include:

Adding a third-party, client-side component such as Angolia search or Disqus comments.
Creating your own server (or serverless) APIs which can be consumed by client-side JavaScript to add required features.
Generating pages containing