Initial website optimization & first post added (see: https://github.… by kamilpytlak · Pull Request #6 · ttscience/ttscience_blog

kamilpytlak · 2024-10-01T11:40:05Z

netlify · 2024-10-01T11:40:26Z

❌ Deploy Preview for ttscienceblog failed.

⚠️ Continuous deployment needs attention — organization-owned private repository detected.
Upgrade to Pro or change repository settings in order to continue deploying from this repository.
For more information, visit the deploy log and FAQ page.

Name	Link
🔨 Latest commit
🔍 Latest deploy log	https://app.netlify.com/sites/ttscienceblog/deploys/67407def3c5895917aff73f1

salatak

I like the way the user is guided through the topic, but I would expect a clearer indication of the benefits that come from using the proposed solution.

salatak · 2024-10-18T14:10:12Z

config.yaml

-    Title: "Hi there 👋"
-    Content: "Welcome to the TT Science Development Blog! Here, you'll find insights, tutorials, and updates about our work in clinical data science, including R, Python, and optimization techniques. Explore our posts, check out our projects on GitHub, or learn more about our team."
+    Title: "Hey there, data enthusiasts! 👋"
+    Content: "Here at TT Science, we are a dynamic team of individuals — like trees in a [random forest](https://scikit-learn.org/stable/modules/generated/sklearn.ensemble.RandomForestRegressor.html) 🌳🌲 — each bringing our own unique strengths and perspectives to the table. While we share a common goal of advancing clinical data science, it’s our diversity that makes us truly special! <br><br> 🌈 Join us on this exciting journey 🚀 as we share insights, tutorials, and updates on our innovative projects involving R, Python, and biostatistics. 💻🔍 Explore our posts, dive into our GitHub projects 📂, and get to know the passionate minds behind our work. Together, we’re harnessing the power of data to create impactful solutions in healthcare. 💡💖 Welcome to our forest of knowledge! 🌳🌟"


The intro sounds pretty good, but we're overusing emojis, which makes it look a bit childish and overwhelming at the same time. I think it's better to aim for a 'casual elegant' tone in writing. I like the second part of the intro the most, and I'd consider tweaking the first part, although it's not essential.

Also, we already have "about us" and "authors" pages, so there are plenty of places where we talk about ourselves. My proposal: let's revert to the old version, where this was just a friendly waving hand.

What do you think of a variation like this (without “Content”, the page is left a little blank)?

config.yaml

salatak · 2024-10-18T14:18:32Z

content/about.md

+We’re writing here as practitioners in clinical statistics, data science, and machine learning. Our team consists of Biostatisticians, Data Scientists, Bioinformaticians and Developers. We’re not marketers—we’re the ones in the trenches, working with data every day to solve real-world problems. We support researchers, scientists, and hospitals in medical and life sciences research projects by delivering relevant documentation, software, and statistical reports, utilizing programming languages such as R and Python.
+
+“Data for good” is in our DNA. We’re passionate about doing meaningful work, which includes engaging in open-source projects. We don’t just crunch numbers; we aim to make a positive impact through our expertise. Our main goals include supporting investigators in the planning, execution, and finalization of clinical trials—covering tasks such as sample size calculations, study design, developing clinical trial protocols in compliance with EMA and FDA guidelines, creating Statistical Analysis Plans (SAPs), validating data, and generating statistical reports — as well as developing dedicated software, including applications such as to facilitate the analysis and validation of medical data.  
+
+We believe in science and want innovative scientists to tackle the questions and challenges of today’s world. We want scientists to always be able to rely on data—because we all rely on them. Our partnership approach is rooted in the scientific method—we combine data, our clients’ expertise, and our competencies to achieve the best results. 
+
+Our mission is to share knowledge with the broader community engaged in medical IT software development, clinical research, and the analysis of medical and biological data, as well as those just beginning their careers in these fields. Through our published content, we aim to showcase solutions for software development and provide guidance to researchers, particularly in the areas of statistics, machine learning, and the regulatory frameworks of the European Medicines Agency (EMA) and the U.S. Food and Drug Administration (FDA).
+
+If you want to check our official page, visit Transition Technologies Science—you’ll learn that we provide software, data science services, and offer analytical support at every stage of clinical trials.
+
+Reach out to us to explore the world of data - our email is office@ttsi.com.pl


The about me section is usually a bit more concise. Maybe we could shorten it and add some photos from conferences or our team-building events? I think if people see a wall of text, they might get discouraged. We don’t want an overly serious blog, but rather a professional one with a relaxed tone.

The text itself is good, but I would place the third and fourth paragraphs as a welcome post that outlines our intentions. In the "about me" section, I would keep the focus on us and our work, rather than what’s on the blog.

@aleksandraduda2

I would leave the main page empty, without an "about us" block or with just one sentence. I agree with shortening; removing the 4th paragraph is fine.

...same-but-different-how-advanced-data-science-techniques-help-us-validate-drug-names/index.md

kamilsi

Great work on this article—it’s thorough and informative! I have a few suggestions to refine it further:

Tone: Since we’re aiming for a 'serious friend' tone, I’d recommend avoiding casual language, such as in some headers (e.g., 'Pharmaceutical Gig'). Keeping a professional yet approachable tone would better suit our goals.
Focus: To keep the article sharp and engaging, I’d suggest cutting out anything that isn’t directly about fixing typos in drug names. For example, the introduce_variation function for generating typos feels tangential and might distract readers from the main topic.
Streamlining: There are a few opportunities to streamline the content:
- Avoid repetitions of ideas already covered, like the challenges with inconsistent drug names.
- Minimize code output, such as printing entire data frames or diagnostics, which can overwhelm readers. Summarizing key outputs or providing representative samples instead would keep the focus tight and improve readability.

These changes could make the article clearer, more concise, and more aligned with its purpose. Let me know what you think!

kamilsi · 2024-11-22T13:30:05Z

.Rprofile

@@ -1,3 +1,4 @@
+source("renv/activate.R")


I already can't get this working on my laptop because of some C compilation issues :-/ renv was probably a good idea, but it's going to be complicated.

kamilsi · 2024-11-22T13:31:28Z

assets/css/common/post-single.css


 .post-content {
    color: var(--content);
+    text-align: justify;


I wouldn't say I like justified text aesthetically (especially on webpages, docs are OK). @salatak what do you think?

Comparison: [screenshot] vs. [screenshot]

Please decide, it's all the same to me.

kamilsi · 2024-11-22T15:14:54Z

...same-but-different-how-advanced-data-science-techniques-help-us-validate-drug-names/index.md

+categories:
+  - machine learning
+  - R
+  - statistics
+  - text analysis
+tags:
+  - data visualization
+  - drug names
+  - eCRF
+  - data validation
+  - levenshtein distance
+  - NLP
+  - t-SNE


What is the difference between tags and categories? I think only one of these is displayed on the page.

kamilsi · 2024-11-22T15:15:20Z

...same-but-different-how-advanced-data-science-techniques-help-us-validate-drug-names/index.md

+  - NLP
+  - t-SNE
+slug: same-same-but-different-how-advanced-data-science-techniques-help-us-validate-drug-names
+ShowToc: yes


It's a very long post if you need a ToC :-)

kamilsi · 2024-11-22T15:29:44Z

...same-but-different-how-advanced-data-science-techniques-help-us-validate-drug-names/index.md

+```
+##   [1] "acetylsalicylic acid and corticosteroids"      
+##   [2] "aluminium preparations"                        
+##   [3] "aminophylline"                                 
+##   [4] "amphotericin B"                                
+##   [5] "antazoline"                                    
+##   [6] "artesunate and amodiaquine"                    
+##   [7] "azacitidine"                                   
+##   [8] "benazepril and amlodipine"                     
+##   [9] "benzocaine"                                    
+##  [10] "benzoyl peroxide"                              
+##  [11] "betaine hydrochloride"                         
+##  [12] "betamethasone"                                 
+##  [13] "betaxolol, combinations"                       
+##  [14] "bexagliflozin"                                 
+##  [15] "biperiden"                                     
+##  [16] "bupivacaine and meloxicam"                     
+##  [17] "buspirone"                                     
+##  [18] "calcium lactate"                               
+##  [19] "calcium lactate gluconate"                     
+##  [20] "captopril"                                     
+##  [21] "carumonam"                                     
+##  [22] "casopitant"                                    
+##  [23] "cefapirin"                                     
+##  [24] "ceftibuten"                                    
+##  [25] "chymopapain"                                   
+##  [26] "clotiazepam"                                   
+##  [27] "cyanocobalamin"                                
+##  [28] "desonide and antiseptics"                      
+##  [29] "dexamethasone and antiinfectives"              
+##  [30] "difluprednate"                                 
+##  [31] "digitalis leaves"                              
+##  [32] "diisopromine"                                  
+##  [33] "eosin"                                         
+##  [34] "epinastine"                                    
+##  [35] "eplontersen"                                   
+##  [36] "eptifibatide"                                  
+##  [37] "ferric acetyl transferrin"                     
+##  [38] "fluciclovine (18F)"                            
+##  [39] "flumetasone"                                   
+##  [40] "fluorouracil, combinations"                    
+##  [41] "flutrimazole"                                  
+##  [42] "folic acid"                                    
+##  [43] "fostemsavir"                                   
+##  [44] "gatifloxacin"                                  
+##  [45] "gefarnate, combinations with psycholeptics"    
+##  [46] "histapyrrodine, combinations"                  
+##  [47] "Hyperici herba"                                
+##  [48] "idrocilamide"                                  
+##  [49] "indometacin, combinations"                     
+##  [50] "iodine iofetamine (123I)"                      
+##  [51] "isoprenaline"                                  
+##  [52] "istradefylline"                                
+##  [53] "kanamycin"                                     
+##  [54] "lactulose"                                     
+##  [55] "levodopa"                                      
+##  [56] "levonorgestrel"                                
+##  [57] "lincomycin"                                    
+##  [58] "magnesium carbonate"                           
+##  [59] "mecasermin"                                    
+##  [60] "megestrol and estrogen"                        
+##  [61] "menadione"                                     
+##  [62] "methaqualone"                                  
+##  [63] "micafungin"                                    
+##  [64] "moexipril and diuretics"                       
+##  [65] "narcobarbital"                                 
+##  [66] "nebivolol and amlodipine"                      
+##  [67] "nimetazepam"                                   
+##  [68] "odevixibat"                                    
+##  [69] "pegloticase"                                   
+##  [70] "perphenazine"                                  
+##  [71] "pethidine"                                     
+##  [72] "phenylephrine"                                 
+##  [73] "pipotiazine"                                   
+##  [74] "pirprofen"                                     
+##  [75] "plerixafor"                                    
+##  [76] "potassium citrate"                             
+##  [77] "prazosin"                                      
+##  [78] "prednisone"                                    
+##  [79] "remoxipride"                                   
+##  [80] "reteplase"                                     
+##  [81] "rifamycin"                                     
+##  [82] "rivastigmine"                                  
+##  [83] "roxithromycin"                                 
+##  [84] "salsalate"                                     
+##  [85] "sorbitol"                                      
+##  [86] "streptokinase"                                 
+##  [87] "succinimide"                                   
+##  [88] "taurolidine"                                   
+##  [89] "technetium (99mTc) pertechnetate"              
+##  [90] "teneligliptin"                                 
+##  [91] "theophylline, combinations excl. psycholeptics"
+##  [92] "ticarcillin"                                   
+##  [93] "tiemonium iodide and analgesics"               
+##  [94] "timolol, thiazides and other diuretics"        
+##  [95] "tolperisone"                                   
+##  [96] "tramadol"                                      
+##  [97] "tretoquinol"                                   
+##  [98] "trypsin, combinations"                         
+##  [99] "ursodoxicoltaurine"                            
+## [100] "zidovudine"
+```


Feels a bit redundant, as the structure of the codebook has already been presented earlier. Listing all the names in full doesn’t seem to add much new information and might distract the reader.
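
One way to follow this suggestion is to report the size of the list and show only a short sample. This is a minimal sketch, assuming the names live in a character vector; `drug_names` is a hypothetical stand-in, not the variable used in the post.

``` r
# Hypothetical stand-in for the post's vector of drug names
# (values taken from the output quoted above, with one duplicate added)
drug_names <- c("aminophylline", "folic acid", "levodopa",
                "tramadol", "zidovudine", "folic acid")

# De-duplicate and sort once, then summarise instead of printing everything
unique_names <- sort(unique(drug_names))
cat("Unique drug names:", length(unique_names), "\n")

# A short representative sample is usually enough for the reader
head(unique_names, 3)
```

In the post, the `head()` call could be replaced by a small random sample if the first few names are not representative.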

kamilsi · 2024-11-22T15:32:46Z

...same-but-different-how-advanced-data-science-techniques-help-us-validate-drug-names/index.md

+We're looking to add a bit of confusion to our drug names, so we've created a function called `introduce_variation`. It takes a name and returns a new version with a duplicate, deleted, or rearranged character.
+
+
+``` r
+introduce_variation <- function(name) {
+  # Randomly choose a type of modification to introduce a typo
+  modification <- sample(c("duplicate", "remove", "swap"), 1)
+  name_chars <- unlist(strsplit(name, ""))
+
+  if (modification == "duplicate") {
+    # Duplicate a random character
+    duplicate_pos <- sample(1:length(name_chars), 1)
+    name_chars <- append(name_chars, name_chars[duplicate_pos], after = duplicate_pos)
+  } else if (modification == "remove") {
+    # Remove a random character
+    remove_pos <- sample(1:length(name_chars), 1)
+    name_chars <- name_chars[-remove_pos]
+  } else if (modification == "swap") {
+    # Swap two adjacent characters
+    swap_pos <- sample(1:(length(name_chars) - 1), 1)
+    temp <- name_chars[swap_pos]
+    name_chars[swap_pos] <- name_chars[swap_pos + 1]
+    name_chars[swap_pos + 1] <- temp
+  }
+
+  return(paste(name_chars, collapse = ""))
+}
+```


The introduce_variation function adds complexity that might not be necessary for this article. Since the focus is on fixing errors in drug names, introducing random typos as part of the workflow could confuse readers. Pre-preparing a small set of intentional errors and using them consistently would simplify the explanation and keep the focus on error correction rather than error creation.
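
A possible shape for the pre-prepared errors is sketched below. The typos are hand-written, hypothetical examples, and matching with base R's `utils::adist()` (Levenshtein distance) is an assumption made for illustration, not necessarily the method used in the post.

``` r
# Reference names, taken from the codebook output quoted earlier in this review
codebook <- c("aminophylline", "levodopa", "levonorgestrel", "tramadol")

# Hand-written typos (hypothetical, fixed so every run gives the same result):
# a removed character, two swapped characters, and a duplicated character
typos <- c("aminophyline", "lveodopa", "tramadoll")

# Levenshtein distances: one row per typo, one column per codebook name
distances <- utils::adist(typos, codebook)

# For each typo, pick the codebook entry with the smallest distance
corrected <- codebook[apply(distances, 1, which.min)]
data.frame(typo = typos, corrected = corrected)
```

Because the errors are fixed, the text of the article can refer to specific examples without the output changing between renders.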

kamilsi · 2024-11-22T15:33:20Z

...same-but-different-how-advanced-data-science-techniques-help-us-validate-drug-names/index.md

+# Add additional drug names in Polish
+complete_drug_names <- c(complete_drug_names, c("Kwas acetylosalicylowy i kortykosteroidy", # Acetylsalicylic acid and corticosteroids
+                                                "Węglan magnezu", # Magnesium carbonate
+                                                "Kwas foliowy" # Folic acid
+))
+
+unique(complete_drug_names) |> sort() |> head(10)
+```


The Polish names are confusing here.

kamilsi · 2024-11-22T15:34:05Z

...same-but-different-how-advanced-data-science-techniques-help-us-validate-drug-names/index.md

+```
+## Perplexity: 2 | KL Divergence: 1.365567 
+## Best Perplexity So Far: 2 | Best KL Divergence: 1.365567 
+## 
+## Perplexity: 3 | KL Divergence: 1.467266 
+## Best Perplexity So Far: 2 | Best KL Divergence: 1.365567 
+## 
+## Perplexity: 4 | KL Divergence: 1.605605 
+## Best Perplexity So Far: 2 | Best KL Divergence: 1.365567 
+## 
+## Perplexity: 5 | KL Divergence: 1.612329 
+## Best Perplexity So Far: 2 | Best KL Divergence: 1.365567 
+## 
+## Perplexity: 6 | KL Divergence: 1.841238 


I would skip this; it doesn't add much.
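
If the tuning result is still worth mentioning, a single summary line could replace the full log. This is a sketch under the assumption that the per-perplexity results are collected in a data frame; `tuning` is a hypothetical name, and the values are copied from the log above.

``` r
# Values copied from the log above; in the post they would come from the tuning loop
tuning <- data.frame(
  perplexity = 2:6,
  kl_divergence = c(1.365567, 1.467266, 1.605605, 1.612329, 1.841238)
)

# Keep only the row with the lowest KL divergence and report it in one line
best <- tuning[which.min(tuning$kl_divergence), ]
cat(sprintf("Best perplexity: %d (KL divergence: %.6f)\n",
            best$perplexity, best$kl_divergence))
```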

kamilsi · 2024-11-22T15:34:22Z

...same-but-different-how-advanced-data-science-techniques-help-us-validate-drug-names/index.md

+  theme_bw()
+```
+
+<img src="{{< blogdown/postref >}}index_files/figure-html/unnamed-chunk-11-1.png" width="672" />


It's better for SEO to name chunks.
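
For illustration, knitr builds the figure file name from the chunk label, so a labelled chunk in the source R Markdown file might look like the fragment below. The label `tsne-drug-names` and the plot call are hypothetical placeholders.

````
```{r tsne-drug-names}
# Hypothetical plot call; the chunk label above becomes part of the figure file name
plot(cars)
```
````

With this label, the rendered image would be written as `tsne-drug-names-1.png` instead of `unnamed-chunk-11-1.png`.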

Initial website optimization & first post added (see: #2)

393ee59

kamilpytlak requested a review from kamilsi October 1, 2024 11:40

kamilsi and others added 4 commits October 2, 2024 16:32

About as + some suggestions

7074588

Updated Hugo to 0.135.0 & added "authors-under-post" functionality & …

faf4ff0

…corrections in the post about t-SNE

CTA implemented

9ea1162

Expanding about us sections

4f73a03

kamilpytlak requested review from salatak and removed request for kamilsi October 8, 2024 14:03

salatak requested changes Oct 18, 2024

kamilpytlak added 3 commits October 19, 2024 15:20

Changed favicon to company logo + updated code in post (after Kinga's…

e41dc11

… review)

Added renv

1540722

Removed block with loading of libraries

9074753

kamilsi reviewed Nov 22, 2024


4 participants
