Clustering using GPU nearest neighbours
$60/hr · Starting at $60 · Ongoing

Engineered a GPU-accelerated review-clustering pipeline that generates Azure OpenAI embeddings and retrieves nearest neighbors in PostgreSQL using pgvector’s cosine-distance operator for fast similarity ranking.

Coupled UMAP for non-linear dimensionality reduction with HDBSCAN to discover dense semantic clusters without pre-specifying k, following the documented best-practice usage of both algorithms.

Orchestrated a 72-variant hyperparameter sweep (n_neighbors × min_dist × n_components × min_cluster_size × min_samples) and parallelized evaluation across 8 processes for efficient model selection.

Designed an auto-selection score with Min-Max normalization that weights the Davies–Bouldin, Silhouette, and Calinski-Harabasz indices and enforces guardrails on valid cluster counts to pick the best iteration.

Persisted per-iteration labels, metrics, and configs via SQLAlchemy; materialized the final reviews→cluster mapping in Postgres with safe delete/append semantics for idempotent reruns.

Added structured logging (structlog) and timestamped checkpoints, delivering a reproducible, configurable clustering service for large-scale product-review analysis (Python, RAPIDS cuML, PostgreSQL/pgvector, SQLAlchemy, Azure OpenAI, pandas, scikit-learn).

Skills & Expertise
Cluster Analysis, Cluster Management, Clustering, cuML, HDBSCAN, OpenAI, UMAP

15 Reviews
Nanyo says, "Helped us resolve our issue" for AZURE AKS K10 ingres application gateway on Jul 08, 2024
Dan 54 says, "Thanks for all the help on this. Like I said I might be reaching out for more on this in the future." for Amazon api gateway expert on Feb 04, 2023
Dan 54 says, "We will be working together in the future." for Amazon api gateway expert on Jan 30, 2023
LSE 1 says, "Great working with Chris! Highly recommended!" for Cloud Engineer for Blockchain Analytics on Jul 23, 2021
Steve_Garelick says, "Great Job and Very fast." for Install OSTicket on Server on Jun 06, 2021
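For readers curious how the auto-selection score described in the service summary might work, here is a minimal sketch, not the actual implementation. It assumes each sweep iteration has already produced its three metric values and a cluster count. The function names, the weights (0.4 / 0.3 / 0.3), and the guardrail bounds (2 to 50 clusters) are hypothetical illustrations; the listing does not state them. Silhouette and Calinski-Harabasz are treated as higher-is-better, while Davies–Bouldin is inverted after normalization because lower values indicate better separation.

```python
from typing import Dict, List, Optional


def min_max(values: List[float]) -> List[float]:
    """Scale values to [0, 1]; if there is no spread, return a neutral 0.5."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.5 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def select_best(
    runs: List[Dict],
    weights: Optional[Dict[str, float]] = None,  # hypothetical weights
    min_clusters: int = 2,    # hypothetical guardrail bounds
    max_clusters: int = 50,
) -> Optional[Dict]:
    """Return the highest-scoring run, or None if no run passes the guardrail."""
    weights = weights or {
        "silhouette": 0.4,
        "calinski_harabasz": 0.3,
        "davies_bouldin": 0.3,
    }
    # Guardrail: discard iterations with an implausible number of clusters.
    valid = [r for r in runs if min_clusters <= r["n_clusters"] <= max_clusters]
    if not valid:
        return None

    sil = min_max([r["silhouette"] for r in valid])
    ch = min_max([r["calinski_harabasz"] for r in valid])
    # Lower Davies-Bouldin is better, so flip it after normalizing.
    db = [1.0 - x for x in min_max([r["davies_bouldin"] for r in valid])]

    scored = []
    for run, s, c, d in zip(valid, sil, ch, db):
        score = (
            weights["silhouette"] * s
            + weights["calinski_harabasz"] * c
            + weights["davies_bouldin"] * d
        )
        scored.append({**run, "score": score})
    return max(scored, key=lambda r: r["score"])


# Illustrative, made-up metric values for four sweep iterations.
runs = [
    {"id": 0, "n_clusters": 1, "silhouette": 0.90, "calinski_harabasz": 900.0, "davies_bouldin": 0.2},
    {"id": 1, "n_clusters": 8, "silhouette": 0.45, "calinski_harabasz": 300.0, "davies_bouldin": 0.9},
    {"id": 2, "n_clusters": 12, "silhouette": 0.30, "calinski_harabasz": 200.0, "davies_bouldin": 1.4},
    {"id": 3, "n_clusters": 10, "silhouette": 0.40, "calinski_harabasz": 350.0, "davies_bouldin": 1.0},
]
best = select_best(runs)
print(best["id"], round(best["score"], 3))
```

Normalizing within the set of valid runs only (rather than across all runs) keeps a degenerate single-cluster iteration from stretching the scale, which is one plausible reason to apply the guardrail before scoring.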