Cell killing plot pipeline by jaceybronte · Pull Request #79 · WayScience/gene-process-dependencies

jaceybronte · 2026-07-01T16:39:17Z

This PR processes our collaborator data and provides latent scores for cell killing comparisons.

review-notebook-app · 2026-07-01T16:39:22Z

Check out this pull request on ReviewNB.

See visual diffs & provide feedback on Jupyter Notebooks.

MikeLippincott

LGTM. I left some comments on efficiency, notebook size, and separation of concerns, but overall this looks good!

MikeLippincott · 2026-07-01T16:53:10Z

+# Calculate the Euclidean distance for each row from the mean values
+distances = np.linalg.norm(data - mean_values, axis=1)
+
+# Create a new DataFrame to store distances with SampleID
+new_rnaseq_data['Euclidean_Distance'] = distances


Consider merging these two steps into a single assignment to avoid the intermediate variable.
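One way to read this suggestion is as a single assignment that skips the intermediate `distances` variable. The following is a minimal sketch, not the PR's actual code: the toy values, the column names `gene_a` and `gene_b`, and the choice of `mean_values` are assumptions made for illustration.

```python
import numpy as np
import pandas as pd

# Toy stand-in for the PR's data; sample IDs, columns, and values are hypothetical.
new_rnaseq_data = pd.DataFrame(
    {"gene_a": [0.0, 3.0], "gene_b": [0.0, 4.0]},
    index=["sample_1", "sample_2"],
)
data = new_rnaseq_data[["gene_a", "gene_b"]]
# Assumed reference point; in the PR this is the vector of mean values.
mean_values = pd.Series({"gene_a": 0.0, "gene_b": 0.0})

# Compute the row-wise Euclidean distance and store it in one step,
# without binding the result to a separate `distances` variable.
new_rnaseq_data["Euclidean_Distance"] = np.linalg.norm(
    (data - mean_values).to_numpy(), axis=1
)
```

The result is the same as the two-step version; the only difference is that no extra name is left behind in the notebook namespace.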

MikeLippincott · 2026-07-01T16:53:46Z

+# Print the SampleID and corresponding Euclidean Distance for each row
+for idx, row in new_rnaseq_data.iterrows():
+    print(f"SampleID: {idx}, Euclidean Distance: {row['Euclidean_Distance']}")
+


Consider writing this to a log, or printing only a few rows, to avoid cluttering stdout.
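Both options mentioned here could look roughly like the sketch below. It assumes a DataFrame shaped like `new_rnaseq_data` with an `Euclidean_Distance` column; the toy values and the logger name are hypothetical.

```python
import logging

import pandas as pd

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("euclidean_distance")  # hypothetical logger name

# Toy stand-in for the PR's DataFrame; values are made up for illustration.
new_rnaseq_data = pd.DataFrame(
    {"Euclidean_Distance": [1.2, 0.4, 2.7, 0.9, 1.5, 3.1]},
    index=[f"sample_{i}" for i in range(6)],
)

# Option 1: show only the first few rows in the notebook output.
preview = new_rnaseq_data["Euclidean_Distance"].head(5)
print(preview)

# Option 2: send the full per-sample listing to the logger at DEBUG level,
# so it only appears when the log level is lowered on purpose.
for sample_id, distance in new_rnaseq_data["Euclidean_Distance"].items():
    logger.debug("SampleID: %s, Euclidean Distance: %s", sample_id, distance)
```

With the log level set to `INFO` as above, the loop emits nothing; switching to `logging.DEBUG` restores the full listing when it is needed.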

MikeLippincott · 2026-07-01T16:54:43Z

+    latent_df = pd.DataFrame(latent_predictions, columns=["latent_score"])
+
+
+    print(latent_predictions)


same here for the printing!

MikeLippincott · 2026-07-01T16:55:29Z

+collab_preds_dir = pathlib.Path("../7.collab-data/results").resolve()
+collab_preds_dir.mkdir(parents=True, exist_ok=True)
+
+latent_pred_file = collab_preds_dir / "phgg_latent_predictions.parquet"


consider moving this to the top of the notebook

MikeLippincott · 2026-07-01T16:55:46Z

+# In[9]:
+
+
+# Define the location of the saved models and output directory for results


MikeLippincott · 2026-07-01T16:59:38Z

+overall_counts["percent"] = overall_counts["count"] / total_modelids * 100
+
+# 2. Subset for brain tumors: Neuroblastoma and Diffuse Glioma
+brain_df = total_drugs[total_drugs["OncotreePrimaryDisease"].isin(["Neuroblastoma", "Diffuse Glioma"])]


Are these the only brain tumors, or only the ones you are interested in?

MikeLippincott · 2026-07-01T17:01:36Z

+    df = compute_and_plot_latent_scores(sample, latent_df, drug_max, "name", "pearson_correlation", "Drug")
+    drug_merge_df.append(df)


Consider doing this in one line: append the function's return value directly to the list instead of first binding it to `df`, so the DataFrame is written once and you avoid memory leaks.
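A minimal sketch of the one-line form follows. `compute_scores` is a hypothetical stand-in for `compute_and_plot_latent_scores`, and the sample IDs are made up; only the append pattern is the point.

```python
import pandas as pd


def compute_scores(sample: str) -> pd.DataFrame:
    """Hypothetical stand-in for compute_and_plot_latent_scores."""
    return pd.DataFrame({"sample": [sample], "score": [len(sample)]})


samples = ["s1", "s22", "s333"]  # hypothetical sample IDs

# Append the return value directly, without a temporary `df` name.
drug_merge_df = []
for sample in samples:
    drug_merge_df.append(compute_scores(sample))

# Equivalent list comprehension, if the loop body does nothing else.
drug_merge_df_alt = [compute_scores(sample) for sample in samples]

# Combine the per-sample frames once at the end.
combined = pd.concat(drug_merge_df, ignore_index=True)
```

In Python, binding the result to a name first does not copy the DataFrame, so the practical gain is that no extra reference to the last frame lingers after the loop finishes.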

MikeLippincott · 2026-07-01T17:01:50Z

+    p_df = compute_and_plot_latent_scores(sample, latent_df, reactome_max, "reactome_pathway", "nes_score", "Reactome")
+    c_df = compute_and_plot_latent_scores(sample, latent_df, corum_max, "reactome_pathway", "nes_score", "CORUM")
+    pathway_merge_df.append(p_df)
+    corum_merge_df.append(c_df)


see mem leak comment below

MikeLippincott · 2026-07-01T17:04:01Z

+# In[5]:
+
+
+cell_killing_df <- auc_df


Consider avoiding these rename calls and naming the data frame when it is created.

MikeLippincott · 2026-07-01T17:06:38Z

Nice plots here!

Cell killing plot pipeline

1f3f40f

jaceybronte requested a review from MikeLippincott July 1, 2026 16:40

MikeLippincott approved these changes Jul 1, 2026

jaceybronte wants to merge 1 commit into WayScience:main from jaceybronte:cell-killing-plots


2 participants
