The Basics of a Pivot Table

Pivot tables are awesome. You can take a range of data, apply a little magic to it, then manipulate that data in so many ways to make analyzing and predicting much easier. It’s a great way to compare and manipulate data on the fly in a meaningful matter. They seem like these really difficult and esoteric things to create, and people who can use them well are very much valued. While a Pivot Table in Excel can seem quite complex, I want to let you in on a little secret:

The Basics of a Pivot Table are Easy

The first thing you need is a large range of data. For the purposes of this exercise, we have four friends who are competing for the first six months of 2017 to see who gets the highest step count (fitness-minded fellows that they are…) I’m not going to post the whole table, as 180 entries would just be eye-crossingly boring. And you know, that’s the point. But the source table is going to start something like this:

Date Grady Rich Derek John
1/1/2017 10532 12637 8679 12129
1/2/2017 11329 18430 9393 9221
1/3/2017 12475 16594 14010 13990

So, we have five columns – one for the date and one for each of the contestants. They record their step counts each day. But at the end of the contest, they want to analyze that data to see how they did not only by day, but by month or week.

Without Pivot Tables, this would be a royal pain. So, let’s make one.

To make a Pivot Table

  1. Select any cell in a data rage that includes a heading for each column in the top row.
  2. Activate the Insert tab.
  3. In the Tables group, click the PivotTable button to open the Create PivotTable dialog box.
  4. In the Table/Range box, verify that the range in the box is the range you want. By default, it will display the continuous range that has the selected cell in it, but you can also select the cells with the mouse, if you wish.
  5. Select a location for the PivotTable. You can place the PivotTable in a new or existing worksheet. I almost always go with a new worksheet.
  6. Click OK.

Add Pivot Table Fields

You can add fields to a PivotTable to specify the data you want to display. The Fields of the source data appear in the PivotTable Field List task pane.

To add field drag a relevant field from the top of the PivotTable Field List to one of the four areas at the bottom of the task pane. You can add more than one field to an area, and you don’t need to add all fields to the table.

This example was a fairly simple one, as the relevant headers were merely dates and the contestants’ names. I dragged the Date field to the Row box. I then dragged each of the contestants’ names to the Values box.

As of this writing, the current version of Excel does something pretty nifty and will automatically collapse the dates by month, using the sum of the values as the default.

See the plus signs beside each name? From here, you can click on the plus sign if you want to see the value for each date.

But let’s say you wanted to give this to someone who isn’t really very patient with learning how to manipulate Excel, but still needs to filter the data sometimes. What can you do to make this easy on the user?

Slicers in Pivot Tables

Slicers are just buttons that work as filters. To create a Slicer, go to the Pivot TablesTools|Analyze tab. In the Filter group, click on Insert Slicer.

I’ve chosen Months as the option here because the guys want to see how they do month by month as well as how well they’ve done in total.

When we click on each month, we will then display the data only for that month. The beauty here is that the source data is still safe. You can manipulate how the data is displayed in a Pivot Table, but the source data is always unchanged. You will notice also, that in a Pivot Table, the Grand Total always reflects the data that is being currently displayed – no extra formulas or functions to write. Just change on the fly to display what you need!

These are really just the basics of Pivot Tables. You can get a lot more complex with them, if you want to, and I urge you to play and explore. But these basics should be enough to help amaze your co-workers and excite your boss.

As always, if you have any questions, just comment and I’ll try to help you out.

The Secret to Using SUMIF

Today we’re going to learn about some features of Excel and how they interact.

  • Tables and how they’re named
  • SUMIF

We’re going to do this with a little problem our hypothetical boss has given us. It’s the end of the first quarter. She would like to know how much each client has paid us for this quarter.

We’re going to get all of this information from a little ledger she sent us. She’s no Excel expert, but she’s meticulous about records. This means we have plenty to work with just from this little table.

Date Client Source Work Type Payment
1/2/2017 Hogwarts Nepotism IT Consulting $ 1,729.00
1/4/2017 Torchwood Advertising Copywriting $ 2,020.00
1/9/2017 Bene Gesserit Competitive Bid Training $ 1,900.00
1/9/2017 The Royal Frog Trampling Institute Competitive Bid IT Consulting $ 3,829.00
1/11/2017 The Monitor’s Guild Competitive Bid Training $ 4,983.00
1/31/2017 Church of All Worlds Competitive Bid IT Consulting $ 4,422.00
2/1/2017 Torchwood Advertising Training $ 1,772.00
2/3/2017 Council of Elrond Advertising Training $ 1,029.00
2/3/2017 The Royal Frog Trampling Institute Competitive Bid Training $ 2,063.00
2/5/2017 Church of All Worlds Competitive Bid Copywriting $ 2,929.00
2/8/2017 Hogwarts Nepotism Copywriting $ 1,943.00
2/8/2017 The Monitor’s Guild Competitive Bid Training $ 3,872.00
2/10/2017 Bene Gesserit Competitive Bid IT Consulting $ 5,026.00
2/14/2017 The First Order Nepotism IT Consulting $ 2,032.00
2/28/2017 Xavier Institute for Gifted Youngsters Nepotism Training $ 3,098.00
3/2/2017 Bene Gesserit Competitive Bid Training $ 4,465.00
3/5/2017 Torchwood Advertising Copywriting $ 1,072.00
3/5/2017 Torchwood Advertising Training $ 1,943.00
3/6/2017 Hogwarts Nepotism IT Consulting $ 2,063.00
3/7/2017 The Royal Frog Trampling Institute Competitive Bid IT Consulting $ 2,972.00
3/8/2017 Torchwood Advertising Copywriting $ 3,872.00
3/13/2017 Bene Gesserit Competitive Bid Training $ 5,026.00
3/15/2017 Hogwarts Nepotism IT Consulting $ 1,072.00
3/16/2017 Hogwarts Nepotism Training $ 2,075.00
3/30/2017 The Royal Frog Trampling Institute Competitive Bid IT Consulting $ 3,141.00
4/2/2017 The Monitor’s Guild Competitive Bid Copywriting $ 1,772.00

This is just a ledger that records the date, the client, how we got the client, what work we did for the client, and how much we were paid.

The first thing we will do is turn this array into a table. This will make the rest of the calculations easier.

Tables in Excel

To turn an array into a table in Excel:

  1. Select the range you would like to make a table. (For the purposes of this exercise, that’s going to be cells A1:E27)
  2. On the Home tab in the Styles group, click on Format as Table.
  3. Choose the table style you would like.
  4. Make sure that in the Format as Table dialog box, the My table has headers checkbox is checked.
  5. Click OK.

You have now formatted your array as a table. This is more than pretty formatting, however, but also means you have created an object with some specific functions.

Before we get to the problem our boss has asked us to solve, I want to do one more thing – name the table. If you are going to be using more than one table in a spreadsheet, and will be performing calculations from more than one table, it is a good idea to give each table a meaningful name. While we are not going to go quite that far in this exercise, it’s a good general habit.

To Name a Table in Excel:

  1. Make sure a cell in the table is selected.
  2. Look at the top of your screen at the different tabs for ribbons. The last one should be Design with a Table Tools as a label above it. Click on it.
  3. On the Table Tools:Design ribbon in the Properties group, you will see a Table Name text box.
  4. Type the name you want for your table and press <Enter> on your keyboard. (We’re calling this table “Ledger”)

Now that we have formatted the array as a table, we are going to look back at what our boss wants.

The first thing she wants is how much each client has paid. We will start by creating another little array. Then we will use the SUMIF function to add up how much each client has paid us. As we go through this, you will see why I prefer to do this as a table rather than just keeping the array as-is. For the purposes of this exercise, we’re presuming that we’ll be starting an array to the side of the Ledger table starting in cell G1. The formula we will use in cell H2 will be:

=SUMIF(Ledger[Client], G2, Ledger[Payment])

Let’s break this down, as it might look like gobbledygook at first.

The function we will use is SUMIF, so we start with =SUMIF.

The syntax for the Sumif function is:

=SUMIF(range, criteria, sumrange)

So, the range is the comparison. In this case, we want the function to look in the Client column of the Ledger table, which is written as Ledger[Client], with the table name and the column name in brackets right beside it.

Then we want to know the criteria for comparison. There are a couple of ways to do this. You could look for specific text by putting it in quotes. So, we could have written it as “Bene Gesserit” (You put quotes around text you’re searching for). Which begs the question, why did I use the cell reference G2 instead? Well, mostly because I am lazy. I want to be able to write this equation once, then use autofill to complete the rest of the array.

The last thing we want to know is where to look for the values for which we want a total. In this case it is in the Payment column of the Ledger table, so we use Ledger[Payment].

Client Total
Bene Gesserit =SUMIF(Ledger[Client], G2, Ledger[Payment])
The Royal Frog Trampling Institute
Torchwood
The Monitor’s Guild
Hogwarts
Church of All Worlds
Xavier Institute for Gifted Youngsters
The First Order
Council of Elrond

Pop quiz. If I did not set this up as a table, but left the array as-is, could I have used cell references instead?

My word, yes. In this particular example,

=SUMIF(Ledger[Client], G2, Ledger[Payment])

would become:

=SUMIF($B$2:$B$27, G2, $E$2:$E$27)

Notice that I made the cell references absolute? Again, lazy. I want to be able to write this formula once and copy down.

So, why would I not do it that way?

This is a personal choice thing, but I tend to try to give meaningful names to ranges of data I am manipulating so that I can think about what data I am trying to pull rather than being bogged down in making sure cell references match exactly. (Great at math, suck at arithmetic, what can I say?)

Tables are not the only way to do this. Stay tuned next week, where we’ll talk about Named Ranges and how to use them.

Please feel free to contact me with any questions you may have!

Zero and Zip Codes in Excel

Once upon a time, there was a completely random and fictional person who had to convert a mailing list from an Excel spreadsheet to a series of mailing labels via Mail Merge in Word. This completely random and fictional person truly was pretty good with MS Office, but had gotten a bit cocky and wasn’t quite as good as she thought.

This (honest!) completely random and fictional person had a client in Northern New England. The location will become important to the story later.

The spreadsheet had a list of ten thousand addresses. But the completely random and fictional person was undaunted. After all, the source spreadsheet looked wonderful – as well, laid-out and sensible a table as one could wish for when creating an address list:

Beautiful, right? It was going to be an easy mail merge.

(Y’all’ve guessed it wasn’t, I am sure.)

So the completely random and fictional person set up the mail merge and did something pretty stupid regarding printing out ten thousand labels. This completely random and fictional person neglected to do a print preview, set that bad boy to print and went for a cup of coffee and chat with the client before going back to the printer.

At the printer, the completely random and fictional person found ten thousand labels with addresses like this:

Can you spot the error? Check out those ZIP codes. ZIP codes in the US have a minimum of five characters. So why in the world didn’t the beautiful (and correct!) zip codes in the spreadsheet translate to the mail merge?

When you do a mail merge, what Word actually does is take the characters in the cell (or the results of a formula) from the spreadsheet and copies them to the proper field in the document.

Take a look at the actual contents of the zip code cell for Boris Badenov:

Even though the cell is set to display as a five-digit ZIP code, the value in cell F2 is Three Thousand Seven Hundred Sixty-six, not Zero-Three-Seven-Six-Six.

So, what’s the numeric difference between those two numbers? None at all. It is only as a string of characters, the actual Arabic numeric symbols, that this becomes important. When you enter a string of Arabic numeric characters into a cell in Excel, it immediately formats that string as a number and will strip out the numerically meaningless preceding zero.

When you format cells as a ZIP code, it will take that string of four characters, and add a zero for you to see in Excel, but it does not store that zero as an actual character.

Unless…

Instead of formatting the cells as numbers and a ZIP code, you format those cells as text. Then you can keep the preceding zero just fine.

The problem comes in when you hand someone an assignment to enter a bunch of addresses (including some New England addresses) into a spreadsheet. Unless the person doing data entry is thinking regarding data transfer to other applications, they’re not going to bother to think about doing anything other than just formatting the zip code as a zip code. They won’t give it another second’s thought.

If you’re going to be using Excel to store data that has a preceding zero that you will be using in mail merges or transferring to other programs, be sure to select that column of information, and format it as Text. (I don’t recommend the trick of the preceding apostrophe, i.e. ‘03766. While it does, indeed, cause Excel to format the data as text and only display the numbers, it plays hob with data transfers like mail merges!)

Nifty Tip: Using the CONCATENATE Formula

The CONCATENATE formula allows one to combine text from multiple cells into a single cell. The CONCATENATE formula is in the Text category in the Insert Function dialog box.

To add a blank character between words, type a space between quotation marks. This adds a blank character between the separated text. Whatever is between the quotes, be it a space, text or characters, it will show up in the concatenation.

So, if you were trying to create an email address at Dartmouth College, the formula would be:

=CONCATENATE(A3,”.”,B3,”@dartmouth.edu”)

Which would yield: Mickey.Mouse@dartmouth.edu

Tables in Excel and Why They Rock

Quick, when someone mentions the word “table” with reference to Excel, what do you think of?

Probably an array of some sort organizing rows and columns of data. While that’s not incorrect, necessarily, tables are object in Excel that can do so much more.

What are Tables?

Tables are specific objects. When you format a range of cells as a table, you’re telling Excel that you’re going to treat this range as a unit to display and manipulate your data.

So, let’s say you take a range with some data in it that looks like this:

We’re listing students, their Hogwarts houses, their favorite candies, and how Ordinary Wizarding Levels they achieved. We’re going to be doing market research on a new candy flavor, and we will be taking this sample data to try to decide how to target different demographics based on the information we have.

Why Use Tables?

We can manipulate the data in this step by step if we want to – apply color, borders, fill, filters, and so on. But there is an easier way.

Layout

A nice-looking array with banded rows and different formatting for the header isn’t just about the “pretty.” Differences in format, color and shade make the table easier and faster to read. That’s why your supervisor gets excited at well-formatted tables. The data is easier to analyze.

Let’s make this array a table.

  1. Select the range of cells you want to format as a table. In this case, it will be cells A1:E6
  2. On the Home tab in the Styles group, click on the Format as Table button.

  3. The table style gallery will appear. Choose a layout you like.
  4. A popup will appear asking if your table has headers and to verify the range of cells you want to make a table. Most of the time, it will have headers, so make sure you have checked it.

  5. Click OK.

You now have your table. Isn’t it pretty?

The pretty might be enough for you. After all, this is a decent layout, quick to do and easy to add to. Any time you add another record to this table, the range of the table automatically expands to include it, and you’ll notice the banded rows continue. It looks good automatically.

However, while this is nicely laid-out and easy to read, there’s more to tables than just looking good easily. There are some other functions that can be important.

Structured Referencing

For our example, we’re trying to analyze how important someone’s opinion might be based on two things – their Hogwarts House, and how many OWLs they’ve achieved.*

With tables, we can easily create this equation, then apply it to a large table.

For this example, we’re going to be listing the houses with Slytherin being the most influential with a score of 4, then Gryffindor with a score of 3, Ravenclaw 2 and Hufflepuff 1.

We’ll add column to our table called Influence. Then we write this equation in the first cell of the Influence column.

=IF([House]=”Slytherin”, 4, IF([House]=”Gryffindor”, 3, IF([House]=”Ravenclaw”, 2, IF([House]=”Hufflepuff”, 1, 0))))

Looks rather like a monster, doesn’t it?

What this is doing is a series of evaluations that will assign an influence integer based on the House name. Instead of referring to a specific cell in Column C (i.e. C3), it instead goes by the Structure of the table and just refers to the House field. Structured referencing means that we can perform calculations easily based on structure.

Now that nested IF equation is a bit of a pain in the butt to write, and I only want to do it once. Because I write the equation in the first cell of the Influence column in the table, it will automatically copy that equation down to each of the cells in the Influence column.

Now we are going to create our Influence Score based on the number of OWLs achieved times the Influence a particular entry has.

We will again create another column calling it InfluenceScore. We write our equation as =[OWLS]*[Influence], not as =SUM(E2*F2). This is where structured referencing comes in. We’re doing calculations based on the field names, not the specific cells.

You will also note that the field names are not case sensitive. When we press Enter, the equation will copy down the entire column. For eight records, this is not a particularly big deal. Autofill might serve as well. But if you have thousands of records, you might find that this comes in handy. (I sure do!)

Once we have our InfluenceScore, we can then click on one of the AutoFilter buttons on the right of InfluenceScore header to sort the table from highest score to lowest.

From there we can see that Albus Dumbledore is probably the most influential person to approach about candy marketing in the Wizarding World.

Conclusion

As you can see, formatting a range as a table can make sorting, filtering and evaluating date much easier. While this example only has a few records, if you need computer enhancement, you’re probably dealing with hundreds or thousands of data points. When you’re doing so, formatting data in tables for evaluation can make your job much, much easier!

* For the non-Harry Potter fan, an OWL is a test passed in a particular subject. The more OWLs you’ve achieved, the better student you probably are and more likely to have an influential job.

The Right Tool for the Right Job

image

Excel is a powerful program, no doubt.  When you need hard-core data analysis it’s a great tool.  Sadly, this is not how I often see Excel used in my professional life.

I see Excel used as a table creation tool for printed schedules — no calculations being performed at all.  I see Excel being used as a database.   This is forgiveable.  If you know what you’re doing, you can make a decent flat database with Excel. 

But that is exactly the problem.  A well-designed database is one that can be used by someone who does not necessarily know what they’re doing, but can enter data and pull reports easily.  This is where Excel falls down badly.  Hand this tool to someone who does not know what they’re doing, and you’d better be locking down a lot of functions on the user end, or you’re going to wind up with destroyed data.

I encourage people who use Excel for data tracking to look carefully at their data before making a decision about what application to use when doing it.  The two biggest criteria I see are record count and report complexity.

If your data is going to be involving millions of records, you want a database.  Excel will choke.  If you want complex data reporting across several data tables and records, you want a database, not Excel.

If you’re dealing with a few thousand records and need to display that data graphically to explain in a presentation, yeah, Excel is your tool and it works great.