Where Are Breast Cancer Lumps Usually Found?

I will answer this question using the SEER database

Photo by National Cancer Institute on Unsplash

This may look like a no-brainer question, because its answer is present in most oncology textbooks.

I am writing this article today to answer this question using the Surveillance, Epidemiology, and End Results (SEER) database, which I have written two articles about before.

Also, if you want to try reaching the answer yourself, and build some experience with using the SEER database to answer more cancer-related questions, you can start by learning how to gain access to the SEER database by watching this video.

Let’s begin!

After I logged into SEER*Stat 8.3.9, the software designed to help SEER database users extract cancer data, I selected the case listing session, and then selected the database named “Incidence- SEER Research data, 9 Registries, Nov 2019 sub (1975–2017)”. This complex name means that the cancer-related data I am extracting were submitted to the SEER database in November 2019, and that they cover patients who were diagnosed with malignant cancer between 1975 and 2017.

Next, I clicked on the “Selection” tab to start selecting females with breast cancer. Then, I clicked on the “Edit” tab on the far right of the screenshot, as shown below.

After I clicked “Edit”, a smaller window popped up, offering options to select a group of cancer cases. As shown below, I clicked on the (+) sign beside the folder named “Site and Morphology”, and then clicked on “Site recode ICD-O-3/WHO 2008”. This variable denotes the cancer site classification defined by the third edition of the International Classification of Diseases for Oncology (ICD-O-3) in association with the World Health Organization.

Next, I set the operator to (=), and then I scrolled down within the “Values” menu until I reached “Breast”. Every selection criterion that I make is automatically listed below in a separate window named “Selection Statement”.

After I was done with this selection criterion, I hit “OK”.

Although I have used this method to select all cases with breast cancer who were diagnosed between 1975 and 2017, I had to specify that I am looking for women with breast cancer, simply because men too can get breast cancer.

This is the “Selection” window after I have made my first selection.

So I had to make another important selection: females only. I used the same method as for the selection criterion above, except that I selected the “Sex” variable within the “Race, Sex, Year Dx” folder, as also shown below.

After I selected females and clicked “OK”, the final selection criteria were listed in the “Selection” window, as shown in the screenshot below.
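If it helps to see the logic of the two selection criteria written out, here is a minimal Python sketch. It assumes that the two statements are combined with a logical AND, which is how I read the “Selection Statement” window. The records, the field names `site` and `sex`, and the helper `select_cases` are all hypothetical and only illustrate the idea; they are not SEER*Stat's export format.

```python
# Hypothetical records for illustration only; these are not real SEER cases,
# and the field names are assumptions rather than SEER*Stat's own.
records = [
    {"site": "Breast", "sex": "Female"},
    {"site": "Breast", "sex": "Male"},
    {"site": "Lung and Bronchus", "sex": "Female"},
    {"site": "Breast", "sex": "Female"},
]


def select_cases(rows, site, sex):
    """Keep only the rows that satisfy both criteria (logical AND)."""
    return [row for row in rows if row["site"] == site and row["sex"] == sex]


selected = select_cases(records, site="Breast", sex="Female")
print(len(selected))  # number of female breast cancer cases in the toy data
```

Dropping the `sex` condition would bring the male breast cancer case back into the group, which is exactly why the second selection is needed.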

After that, I switched to the “Table” window to select the primary site of the breast cancer in those women, and to see exactly where most of the breast cancer lumps were found.

This is the “Table” window before choosing the primary site of cancer.

Within the “Table” window, I first chose the variable “primary site- labeled” from within the “Site and Morphology” folder. Next, I clicked on the “Column” tab to create a column within the resulting table with the column title “Primary site- labeled”. I have summarized the table selection process in the screenshot below.

After that, I switched to the “Output” window, where I gave my table a name, and then clicked on the lightning bolt-shaped “Execute” tab to order SEER*Stat to make the data table for me. See the screenshot below for details.

You will then see a temporary window popping up with a loading bar indicating the preparation of your desired data table.

Then, the SEER*Stat-made data table will be displayed as follows.

I can’t analyze this data table as it is, so I had to transfer it to Microsoft Excel. If you are good at using Excel for data analysis, then well and good. If you are not, then you can open the data using other statistical packages like JMP, R, Stata, etc.

To transfer this data table to Excel, simply select the whole column by clicking on the column title (your mouse cursor will change into a downward-pointing black arrow). The whole column will be highlighted in black. After that, you can right-click and select “copy” from the drop-down menu.

Once you have copied that column, open Microsoft Excel, and use the “paste” function to paste the data into an Excel table.
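If you would rather take the copied column into Python instead of Excel, a minimal sketch could look like the one below. It assumes the copied text holds a header line followed by one label per case. The sample text, including the way the labels are written, is made up for illustration, and `read_column` is a hypothetical helper, not part of any SEER tool.

```python
# Made-up stand-in for the copied column: a header line, then one label per case.
# The exact label wording here is an assumption, not a real SEER*Stat export.
pasted = """Primary Site - labeled
C50.4-Upper-outer quadrant of breast
C50.9-Breast, NOS
C50.4-Upper-outer quadrant of breast
"""


def read_column(text):
    """Split pasted text into (header, values), skipping blank lines."""
    lines = [line.strip() for line in text.splitlines()]
    lines = [line for line in lines if line]
    return lines[0], lines[1:]


header, values = read_column(pasted)
print(header)
print(len(values), "cases")
```

The text is read line by line rather than split on commas, because a label such as “Breast, NOS” contains a comma of its own.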

I usually like to use the JMP software to analyze data. You can use Excel if you are good at it, or whatever statistical software you are comfortable with. All you have to do is run a “distribution analysis” on the extracted data describing the primary sites of the breast cancer. Of course, you will have to treat the primary site as a categorical variable.
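For a categorical variable, a distribution analysis boils down to counting how often each category appears and turning the counts into percentages. Here is a minimal Python sketch of that idea using only the standard library. The sample labels are made up for illustration and are not SEER results, and `distribution` is a hypothetical helper name.

```python
from collections import Counter

# Made-up labels for illustration only; these are not SEER results.
sites = (
    ["Upper-outer quadrant"] * 3
    + ["Breast, NOS"] * 2
    + ["Central portion"]
)


def distribution(values):
    """Return (label, count, percent) tuples, most frequent category first."""
    counts = Counter(values)
    total = sum(counts.values())
    return [(label, n, 100.0 * n / total) for label, n in counts.most_common()]


for label, n, pct in distribution(sites):
    print(f"{label:25s} {n:3d} {pct:5.1f}%")
```

A bar graph like the one JMP draws is simply a picture of these counts, one bar per category.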

So, I opened this Excel data table using JMP, ran a distribution analysis, and here was the resulting bar graph.

From the above bar graph, we can see that:

Our group of women with breast cancer who were diagnosed between 1975 and 2017 included 724,205 women.

Breast cancer lumps are usually found in the upper outer quadrant of the breast. The data show that almost one third of the women diagnosed with breast cancer had their lumps in that site.

If we exclude vague primary sites, such as “Not otherwise specified” (Breast, NOS), or “Overlapping lesion of the breast”, then we can conclude also that the second most common primary site for female breast cancer is the upper inner quadrant (almost 10% of cases), followed by the lower outer quadrant (6.6%), central portion (5.4%), and the lower inner quadrant (5%).
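The ranking step above can be sketched in Python as follows. The sketch assumes that the vague categories are dropped from the ranking only, while each percentage is still taken over all cases, which is how I read the figures above. The counts are hypothetical and are not the SEER numbers, and `rank_specific_sites` is an illustrative helper name.

```python
# Categories treated as vague; the exact label wording is an assumption.
VAGUE = {"Breast, NOS", "Overlapping lesion of breast"}

# Hypothetical counts for illustration only; these are not the SEER numbers.
counts = {
    "Upper-outer quadrant": 40,
    "Breast, NOS": 30,
    "Overlapping lesion of breast": 20,
    "Upper-inner quadrant": 10,
}


def rank_specific_sites(site_counts, vague=VAGUE):
    """Rank the non-vague sites by count, with percentages over all cases."""
    total = sum(site_counts.values())
    specific = {s: n for s, n in site_counts.items() if s not in vague}
    ranked = sorted(specific.items(), key=lambda item: item[1], reverse=True)
    return [(site, n, round(100 * n / total, 1)) for site, n in ranked]


for site, n, pct in rank_specific_sites(counts):
    print(site, n, f"{pct}%")
```

If you prefer percentages among the specific sites only, divide by the sum of the specific counts instead of `total`.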


