Posts

AIML - Recommendation Systems

Two techniques are used.
Content-based filtering: recommendations are driven by the content (category) of what similar users watch. If user1 watches Action & Adventure movies and user2 watches the same, new movies watched by user1 are suggested to user2 based on the category of the movie. Eg: Amazon online shopping.
Collaborative filtering: identifies the behavior of users and groups users accordingly. New movies are suggested based on the other users in the group, irrespective of genre/category. Eg: Netflix, Amazon Prime.
Movie genres: Action, Comedy, Adventure, Romantic.
Content-based recommendation system design (movie recommendation system):
Import data from TMDB (movie details with overview).
from sklearn.feature_extraction.text import TfidfVectorizer
Remove stop words and special characters, and replace NaN values with blanks.
Call fit_transform on the movies' OVERVIEW field to get a sparse matrix.
from sklearn.metrics.pairwise import sigmoid_kernel
Note: sigmoid transforms input between 0 t...
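The design steps above can be sketched without any dependencies. This is a minimal illustration, not the post's actual code: the movie titles, overviews and stop-word list are made up, and the TF-IDF and sigmoid kernel are hand-written stand-ins for scikit-learn's TfidfVectorizer and sigmoid_kernel.

```python
import math
import re
from collections import Counter

# Hypothetical toy data standing in for the TMDB "overview" field.
MOVIES = {
    "Space Raiders": "A crew of pilots fights an alien fleet in deep space.",
    "Star Patrol": "Pilots defend Earth from an alien fleet in space.",
    "Love in Paris": "Two strangers fall in love during a summer in Paris.",
    "Paris Nights": "A summer romance between two strangers in Paris.",
}
# Small illustrative stop-word list (an assumption, not a standard list).
STOP_WORDS = {"a", "an", "the", "of", "in", "from", "during", "between", "and"}


def tokenize(text):
    """Lowercase, drop special characters and stop words; None becomes blank."""
    words = re.findall(r"[a-z]+", (text or "").lower())
    return [w for w in words if w not in STOP_WORDS]


def tfidf_vectors(docs):
    """Return one L2-normalised TF-IDF vector per document."""
    tokens = [tokenize(d) for d in docs]
    vocab = sorted({w for doc in tokens for w in doc})
    n_docs = len(docs)
    doc_freq = Counter(w for doc in tokens for w in set(doc))
    # Smoothed inverse document frequency.
    idf = {w: math.log((1 + n_docs) / (1 + doc_freq[w])) + 1 for w in vocab}
    vectors = []
    for doc in tokens:
        tf = Counter(doc)
        vec = [tf[w] * idf[w] for w in vocab]
        norm = math.sqrt(sum(v * v for v in vec)) or 1.0
        vectors.append([v / norm for v in vec])
    return vectors


def sigmoid_kernel(x, y, gamma, coef0=1.0):
    """Sigmoid kernel: tanh(gamma * <x, y> + coef0)."""
    dot = sum(a * b for a, b in zip(x, y))
    return math.tanh(gamma * dot + coef0)


def recommend(title, top_n=1):
    """Return the top_n titles whose overview is most similar to `title`."""
    titles = list(MOVIES)
    vectors = tfidf_vectors([MOVIES[t] for t in titles])
    gamma = 1.0 / len(vectors[0])  # 1 / number of features
    i = titles.index(title)
    scores = [
        (sigmoid_kernel(vectors[i], vectors[j], gamma), titles[j])
        for j in range(len(titles))
        if j != i
    ]
    scores.sort(reverse=True)
    return [t for _, t in scores[:top_n]]


print(recommend("Space Raiders"))
```

Because the kernel is monotonic in the dot product, the ranking matches what plain cosine similarity on the same vectors would give; the kernel only rescales the scores.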

KNN - K Nearest neighbours

Machine learning models make predictions from the past data available. KNN is one of the simplest supervised ML algorithms and is mostly used for classification. It classifies a data point based on how its neighbors are classified: KNN stores the available cases and classifies new cases based on a similarity measure.
Choice of K: a common rule of thumb is K = sqrt(N), where N is the number of training samples.
KNN is a lazy learning algorithm.
How do we teach kids to differentiate between a cat and a dog? By type of claws, ear length, sound (bark vs meow), and whether the animal plays around or not. Kids identify a given animal based on feature classification.
Use cases:
Recommender systems: the biggest use case, in real-time online shopping, OTT platforms, advertisement.
Content search: finding documents with similar topics among billions of documents.
Image & video recognition.
Height, weight -> derive T-shirt size.
Predict dog category.
Predict overweight or not based on height & weight.
Predict diabetes: Pregnancies, Glucose, BP, Skin thickness, insulin, BMI, diabetes pede...
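As a rough illustration of the height/weight -> T-shirt size use case, here is a minimal KNN classifier in plain Python. The training samples are invented for the example, and choosing K as sqrt(N) bumped to an odd number is just one common convention, not something the post prescribes.

```python
import math
from collections import Counter

# Hypothetical (height cm, weight kg) -> T-shirt size samples.
TRAIN = [
    ((158, 58), "M"), ((160, 59), "M"), ((163, 61), "M"), ((160, 64), "M"),
    ((165, 65), "L"), ((168, 66), "L"), ((170, 68), "L"), ((170, 64), "L"),
    ((168, 62), "L"),
]


def knn_predict(train, query, k=None):
    """Classify `query` by majority vote among its k nearest neighbours."""
    if k is None:
        # Rule of thumb: K = sqrt(N), made odd to reduce ties between two classes.
        k = max(1, round(math.sqrt(len(train))))
        if k % 2 == 0:
            k += 1
    # Lazy learning: no training step, just sort stored cases by distance.
    neighbours = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


print(knn_predict(TRAIN, (161, 61)))
print(knn_predict(TRAIN, (169, 66)))
```

In practice the features would usually be scaled first, since Euclidean distance lets the feature with the largest range dominate.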

DB-REST access

ACL (Access Control List), ORDS (Oracle REST Data Services)
An ACL is a list of access control entries that restricts the hosts that are allowed to connect to the Oracle database. ACLs are created using the dbms_network_acl_admin and dbms_network_acl_utility packages.
Calling web services from Oracle PL/SQL:
Links
https://www.youtube.com/watch?v=avBgGPw0_sA&t=16s
https://slobaray.com/2015/02/05/calling-web-services-from-oracle-plsql/
http://www.dba-oracle.com/t_advanced_utl_http_package.htm
https://gist.github.com/ser1zw/3757715
https://www.oradev.com/utlhttp.html
https://docs.oracle.com/cd/F49540_01/DOC/server.815/a68001/utl_http.htm
Oracle provides a package called UTL_HTTP. With this package we can call web services using the POST method and read the response. To use a web service we first need to register its URL in the database by assigning it to the Access Control List (ACL). The code below needs to be executed as the SYS user, so that we can ut...

Deep Learning

ANN, CNN, RNN, Back Propagation

Natural Language Processing

Uses:
Text classification: used for filtering information in web search; helps to avoid spam mail.
Sentiment analysis: identify opinions and sentiments of the audience.
Chatbots: used for customer support, in HR systems, and in e-commerce systems.
Customer service: insights into audience preferences; helps improve customer satisfaction.
Advertisement: helps target the right customers.
Tokenization: the process of breaking up text into smaller pieces (tokens). A token can be a word or a sentence.
Stop words: words that don't convey actual meaning, such as an, a, when.
Part-of-speech (POS) tags: nouns, verbs, adjectives etc.
Stemming: the process of reducing a word to its root, or stem.
Lemmatization: the process of reducing a word to its root in dictionary form.
Named Entity Recognition: recognize entities like people, organizations, places etc.
Bag of words: convert to lower case, perform stemming and lemmatization, remove stop ...
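Tokenization, stop-word removal and bag of words can be sketched in a few lines of plain Python. The stop-word list and sample sentences are assumptions made for the example, and stemming and lemmatization are left out because they normally come from a library such as NLTK or spaCy.

```python
import re
from collections import Counter

# Small illustrative stop-word list (an assumption, not a standard list).
STOP_WORDS = {"a", "an", "the", "is", "was", "when", "and", "but"}


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def bag_of_words(docs):
    """Return (vocabulary, count vectors) after removing stop words."""
    cleaned = [[t for t in tokenize(d) if t not in STOP_WORDS] for d in docs]
    vocab = sorted({t for doc in cleaned for t in doc})
    vectors = []
    for doc in cleaned:
        counts = Counter(doc)
        # One column per vocabulary word, holding its count in this document.
        vectors.append([counts[w] for w in vocab])
    return vocab, vectors


docs = ["The movie was good", "The movie was bad, bad, bad"]
vocab, vectors = bag_of_words(docs)
print(vocab)
print(vectors)
```

Each document becomes a fixed-length vector of word counts, which is the form a text classifier or sentiment model would consume.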

Statistics in Machine Learning

Note: in the data science life cycle, more than 60% of the time goes into data analysis such as feature engineering and feature selection.
Feature engineering means cleaning data and handling missing values, unbalanced data and categorical features.
Python packages:
Pandas: read and prepare datasets (read_csv, head, isnull, get_dummies, drop, concat)
Numpy: work with arrays
matplotlib.pyplot: visualization
Seaborn: visualization (heatmap, countplot, boxplot)
Handling categorical features: one-hot encoding for nominal variables, label encoding for ordinal variables.
Ways of finding outliers: scatter plot, box plot, z-score, IQR.
Correlation: strength of association between two variables. It is symmetric: the correlation of A and B equals that of B and A.
Regression: used when one of the variables is dependent and the other is independent. Regression equation: the average value of y is a function of x. R square. Significance of F and P values.
Covariance (cov): Quantify relationship between features, rand...
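The two numeric outlier checks listed above (z-score and IQR) can be sketched with the standard library alone. The sample values, the z-score threshold of 3 and the 1.5 x IQR fences are conventional choices assumed for the example, not values taken from the post.

```python
import statistics

# Hypothetical measurements with one obvious outlier.
VALUES = [10, 12, 11, 13, 12, 11, 10, 12, 13, 11, 12, 11, 10, 13, 12, 95]


def zscore_outliers(values, threshold=3.0):
    """Flag values more than `threshold` standard deviations from the mean."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    if std == 0:
        return []
    return [v for v in values if abs(v - mean) / std > threshold]


def iqr_outliers(values, k=1.5):
    """Flag values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


print(zscore_outliers(VALUES))
print(iqr_outliers(VALUES))
```

The IQR rule is the same one a box plot uses to draw its whiskers, so both views should flag the same points.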

Decision Tree

Decision Trees are also called Objective Segmentation.
Decision Tree vs Logistic Regression:
Segmentation: all accounts within a node are treated the same.
Logistic regression: each account is given a score and ranked against the others, so it is more granular.