top of page
0C7C396F-158E-4A61-8F9A-B0B6F603D7A4.png
Writer's pictureKane Schwarz

Guide to Evaluating Players using Microsoft Excel


Spencer Strider warming up for the Atlanta Braves; CC by license 2.0

Like many of the team members here at The Drummey Angle, I am a current college student nearing graduation, aspiring to have a future career in baseball operations. As graduation inches ever so closely, I have recently undergone the process of applying for entry-level positions within MLB front offices. Without having ventured into this space before looking for employment, I was prepared for anything. Application questions and requirements range based on the position, but a recent application question piqued my interest: “Who do you think are the five best starting pitchers in MLB right now?”. This is a very open-ended question, but similar to many of the questions I have seen on applications related to baseball ops. This specific question was to be answered disregarding outside variables like age, contract status, etc.. Given that, I wanted to approach this question statistically based off of the 2023 regular season. As many readers can relate to the application process, and may come across a similar question, I thought it would be useful to explain how I went about answering this through Excel so that this simple method can be easily repeated by someone following the same path.


My approach to this question was simple. I wanted to import relevant FanGraphs data from the 2023 season into Excel to help me arrive at a conclusion. To start, I created a custom leaderboard within FanGraphs (via paid subscription) of starting pitchers from the 2023 regular season. Building this leaderboard, I took into consideration the 12 metrics that I use most often to evaluate pitching:


- FIP

- ERA

- WHIP

- K/BB

- fWAR

- H/9

- opp. BABIP

- opp. Swing %

- opp. Contact %

- Zone %

- Barrel %

- Hard Hit %


The list of metrics above by no means is a ‘one-size-fits-all’ way of evaluating starting pitchers. You can just as easily incorporate the metrics you place the most importance in, like Statcast data for example.


After creating a leaderboard with the metrics I selected, I was able to import that data (with corresponding pitcher names and teams) via a CSV file into Excel to begin the real work. Initially, each pitcher is assigned a row within the dataset, and each column represents a metric and that pitcher’s output for the respective metric. For example, here is what that looks like with opp. BABIP:




Given this data, I wanted to take all of the metrics into consideration, weighted evenly. To do this, I utilized the RANK function within Excel. Next to each metric column, I inserted an empty column titled “Rank”. The notation for the function goes as follows:


=RANK(number,ref,[order])


Explanations on how to use the RANK function are readily available online, but below I have provided an example of what the formula for the first cell in the “Rank” column looks like next to BABIP for reference:




In my case, column C represents BABIP, and Column D represents the respective pitcher’s rank amongst others in regard to BABIP. The first cell in each column is the title, so data begins with the second row, hence seeing D2 on the left and C2 within the number argument. The reference argument ranges from the first row of data (C2) to the last row (C45), and it’s important to fix that range of cells using the $ sign in front of the column headers ( C ) and row headers (2,45). The lower the BABIP, the higher the rank, which is denoted by the 1 in the order argument representing ascending order. With the RANK function in place, here is what the “Rank” column looks like as it corresponds to BABIP within my Excel sheet:




You could very easily sort this “Rank” column solely based on BABIP so that the starting pitchers with the best ranks for BABIP are shown at the top. However, I wanted to take into consideration 11 more metrics. To do this, I simply took the same process used for BABIP and applied it to the other metrics, giving each column an adjacent “Rank” column, but changing the order argument based on each metric. For instance, while ascending order applies for BABIP, descending order is necessary for other metrics like fWAR and Swing %. To create “Rank” columns in descending order, all you have to do is change that 1 to a 0. Here is an example of what that looks like within my Excel sheet:




Similar to before, Column L represents fWAR, and Column M represents each respective pitcher’s rank in fWAR amongst other starters. As opposed to before, a 0 in the order argument represents descending order, as a higher fWAR corresponds to better performance, unlike metrics like BABIP.


After establishing each metric and each pitcher’s rank for the corresponding metric, there were only a few more steps I went through to come to my conclusions. When looking at the Excel sheet in full, I wanted to provide some more aesthetic to each “Rank” column so that one could easily identify which starting pitchers finished within the top 10 for each metric without having to sort a single column by itself. To accomplish this, you can apply what is called Conditional Formatting to each “Rank” column. First, highlight whichever “Rank” column you want to apply this to by clicking on top of the column. Once highlighted, the column should look like this:




Within the Styles pane of Excel is where one can find Conditional Formatting. To differentiate the top 10 pitchers for each metric, I selected the Conditional Formatting option, navigated to Highlight Cells Rules, and from there selected the Less Than option. You are then directed to the following screen:



As shown above, I have it set to highlight cells that are less than 11 (top 10) with Light Red Fill with Dark Red Text. The highlight setting is just based on personal preference, but the red is a useful default. Once you apply this setting to the “Rank” column, you can do the same for the other “Rank” columns so that anyone viewing the Excel can easily scroll through the sheet to find the top 10 pitchers for each metric. Here is what that looks like applied across the entire dataset:




To complete this entire process, there is only one more step I went through to bring me to some conclusions about the top 5 starting pitchers in the game right now. After inserting an empty column next to Team (right-click on the BABIP column and select Insert), I titled that column “Average Rank”. This column intends to provide a number that evenly weighs each metric. Given my “Rank” columns for all 12 metrics, the best way to do this is to take each rank value from each pitcher and find the average overall rank. I utilized the simple AVERAGE function within Excel to complete this. Within the second cell of the “Average Rank” column, I provided the following function:



Each column header represents a “Rank” column, and each row header represents the rank corresponding to the respective starting pitcher for that row (in this case, row 2 represents Gerrit Cole). Then take this function and apply it to the rest of the “Average Rank” column using Autofill (double-click the “plus” sign at the bottom right of the second cell in “Average Rank”, in this case C2).




Now, the “Average Rank” column can be used to evaluate each pitcher in respect to the 12 metrics taken into consideration, weighted evenly. The lower the “Average Rank”, the better that pitcher performed in 2023. An easy way to see pitchers organized from best to worst is to highlight the “Average Rank” column, navigate to the Editing pane within Excel, select Sort and Filter, and choose Sort Smallest to Largest. One last aesthetic component you can implement would be to apply Conditional Formatting to the “Average Rank” column to have it look similar to this:



You can easily do the same by simply highlighting the column, navigating to Conditional Formatting, selecting Color Scales, and applying whichever style you see fit (in my case, Red-Yellow-Green). And with that, the Excel evaluation is complete! The finished product comes out as such:




According to my Excel sheet and the calculations within, the top 5 pitchers from the 2023 regular season in respect to the 12 metrics taken into consideration were Gerrit Cole, Zack Wheeler, Spencer Strider, Pablo Lopez, and Zach Eflin. Now, there is always room for debate in discussions like these, especially in one surrounding the “five best starting pitchers in MLB right now”. While I included the first four pitchers listed in my answer on the application, Zach Eflin has a somewhat loose case for being a top 5 pitcher in the game, so I turned to historical data to help me with determining the final pitcher on my list: George Kirby, who comes in at 11 on the “Average Rank” column.


By writing this article, I hope to have provided a simple way to evaluate players using multiple metrics within a commonly used application like Microsoft Excel. One can use this same method to evaluate average metrics over a range of years as opposed to a single season like this example. This method can also be applied to not just pitchers but position players as well. Say instead of the 12 pitching metrics I used in my example, you want to evaluate hitters based on OPS+, barrel rate, fWAR, and wRC+. The same steps outlined above can be taken to do just that, made even easier with fewer metrics. While dealing in more complex data management applications like Tableau or RapidMiner can be overwhelming, I sought to provide a simple way to do much of the same evaluations within Excel. Whether it’s for a job application or a passion project, I hope baseball fans on all ends of the spectrum can take this simple method and apply it themselves.


Sources:

FanGraphs.com

Corporatefinanceinstitute.com

Automateexcel.com

Microsoft Excel


Comments


bottom of page