Wednesday, August 23, 2017

Text Mining in Python




https://pythonprogramming.net/text-classification-nltk-tutorial/
http://andybromberg.com/sentiment-analysis-python/
http://sentiment.christopherpotts.net/





Text Mining in R



install.packages( c("tm", "VCorpus", "hunspell"),  dependencies = TRUE)

install.packages( c("Rweka", "RMySQL", "textmining"),  dependencies = TRUE)

install.packages( c("iemisctext", "ngram"), dependencies = TRUE)

if(TRUE)
{
  WORKDIR <- "D:/DeltaPartners2017/FieldOperations/TextMining/Mining/"
  setwd(WORKDIR)
  Sys.setlocale("LC_ALL", "Arabic")
  memory.limit(size = 16072)

  library(tm)
  library(iemisctext)
  library(ngram)
  library(RWeka)

  # Assuming "anarchy" is a character vector of text: wrap it in a corpus, build a DTM, list frequent terms
  data(anarchy)
  a <- DocumentTermMatrix(Corpus(VectorSource(anarchy)))
  findFreqTerms(a, 5)

  data("crude")

  # Corpus --> data frame
  dataframeCrude <- data.frame(text = unlist(sapply(crude, `[`, "content")),
                               stringsAsFactors = FALSE)
  dataframeCrude[1, ]
}
 
 
# Raw data to corpus
serv_log <- read.csv(paste(WORKDIR, "test.csv", sep = ""))    # assumes test.csv holds one text column
mycorpus <- Corpus(VectorSource(serv_log))

mydfCorpus <- data.frame(text = unlist(mycorpus), stringsAsFactors = FALSE)
mydfCorpus[1]

# Map the corpus by applying different filters
mycorpusAfterMap <- tm_map(mycorpus, content_transformer(tolower))
mycorpusAfterMap <- tm_map(mycorpusAfterMap, removePunctuation)
mycorpusAfterMap <- tm_map(mycorpusAfterMap, removeNumbers)
mycorpusAfterMap <- tm_map(mycorpusAfterMap, removeWords, stopwords(kind = "en"))
mycorpusAfterMap <- tm_map(mycorpusAfterMap, stripWhitespace)
mycorpusAfterMap <- tm_map(mycorpusAfterMap, stemDocument)

# Applying this gives an error:
# mycorpusAfterMap <- tm_map(mycorpusAfterMap, PlainTextDocument)

########################  Generate DT matrix  ########################

myDTmatrix <- DocumentTermMatrix(mycorpusAfterMap)
findFreqTerms(myDTmatrix, 5)
keyword.freq <- colSums(as.matrix(myDTmatrix))          # frequency of each word (terms are the columns of a DTM)

myDTmatrix_NonSparse <- removeSparseTerms(myDTmatrix, 0.9)
keyword_NonSparse.freq <- colSums(as.matrix(myDTmatrix_NonSparse))   # frequency of each remaining word

# Find n-grams from the corpus using the ngram package
strCorpus <- concatenate(lapply(mycorpusAfterMap, `[`, 1))
bigram <- ngram(strCorpus, n = 2)
# print(ngram(strCorpus, sep = " ", n = 2), output = "full")
get.phrasetable(bigram)

# Find n-grams from the corpus using the RWeka package
# mydfCorpus <- data.frame(text = unlist(mycorpusAfterMap), stringsAsFactors = FALSE)
# bigram <- NGramTokenizer(mydfCorpus, Weka_control(min = 2, max = 2))
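
A common way to combine the two packages (a hedged sketch, not part of the notes above: the BigramTokenizer helper name and the control list are my own) is to hand the RWeka tokenizer to tm when building the document-term matrix, so the matrix columns become bigrams instead of single terms:

library(tm)
library(RWeka)

# Helper (hypothetical name): split a document into 2-grams with RWeka
BigramTokenizer <- function(x) NGramTokenizer(x, Weka_control(min = 2, max = 2))

# Bigram document-term matrix built from the cleaned corpus above
myBigramDTM <- DocumentTermMatrix(mycorpusAfterMap,
                                  control = list(tokenize = BigramTokenizer))
findFreqTerms(myBigramDTM, 5)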

Monday, April 10, 2017

Map with R



How it really works in R with ggplot2


http://eriqande.github.io/rep-res-web/lectures/making-maps-with-R.html

You need to create data in the following format (the sketch after the sample shows one way to build and plot it):

                long      lat        group order region subregion
#> 1 -101.4078 29.74224     1     1   main      
#> 2 -101.3906 29.74224     1     2   main      
#> 3 -101.3620 29.65056     1     3   main      
#> 4 -101.3505 29.63911     1     4   main      
#> 5 -101.3219 29.63338     1     5   main      
#> 6 -101.3047 29.64484     1     6   main      
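
A minimal sketch of where such a data frame comes from and how it is drawn (assuming the maps and ggplot2 packages are installed; the object name usa is just illustrative):

library(ggplot2)
library(maps)

# map_data() returns exactly this long/lat/group/order/region/subregion layout
usa <- map_data("usa")
head(usa)

# Draw the polygons; group = group keeps separate shapes from being joined up
ggplot(usa, aes(x = long, y = lat, group = group)) +
  geom_polygon(fill = "grey", colour = "black") +
  coord_quickmap()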


Some useful packages (a short googleVis example follows the list):


  • ggmap
  • googleVis
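
For googleVis, a minimal sketch using the Exports demo data bundled with the package (the object name geo is illustrative); plot() renders the interactive chart in the browser:

library(googleVis)

data(Exports)   # country-level demo data with Country and Profit columns

geo <- gvisGeoChart(Exports, locationvar = "Country", colorvar = "Profit")
plot(geo)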







Sunday, April 9, 2017

Teradata Tutorial




Import a CSV file as a table in Teradata


1. First, create the table you want with the same column structure as the file.

Create table dp_cad_analytics.A
(
Testcol1 varchar(20)
, Testcol2 varchar(20)
);

2. Then modify the query below so that the number of question marks equals the number of columns:

Insert into dp_cad_analytics.A
Values
(?,?);


3. Select "Import Data" from the "File" menu and execute the query; it will prompt for the file.




Wednesday, March 22, 2017

SAS tutorial





SAS E-Guide


1. Library Creation

Create a program and write the following command:

libname tanvir '/sasdata2/SAS-USERS/taaalam/';





2. Import File into Library

File > Import Data > Select Local File > Output SAS data set (use the SAS server and the library name that you created in step 1)




SAS E-Miner


1. Map Libraries in SAS E-Miner (lets E-Miner know about the library you created in E-Guide)

File > New > Library


2. Bring Files under "Data Source"

Right click "Data Source" > "Create Data Soruce"





Thursday, January 14, 2016

OpenMP Tutorial


Some tutorials:


http://people.math.umass.edu/~johnston/PHI_WG_2014/OpenMPSlides_tamu_sc.pdf

http://www3.nd.edu/~zxu2/acms60212-40212-S12/Lec-11-03.pdf


Scheduling Tutorial

For beginners: https://people.sc.fsu.edu/~jburkardt/c_src/schedule_openmp/schedule_openmp.html

https://www.buffalo.edu/content/www/ccr/support/training-resources/tutorials/advanced-topics--e-g--mpi--gpgpu--openmp--etc--/2011-09---practical-issues-in-openmp--hpc-1-/_jcr_content/par/download/file.res/omp-II-handout-2x2.pdf

Fibonacci Number generation


https://gist.github.com/CarlEkerot/2601195

Monday, October 19, 2015

SVN tutorial




SVN cheat sheet:

http://www.cheatography.com/davechild/cheat-sheets/subversion/

For more useful svn commands, check the following tutorial:
http://openoffice.apache.org/svn-basics.html



Server: check out the SVN repository


svn co https://svn.apache.org/repos/asf/openoffice/trunk  mySVNrepository

Local machine: make a working copy and connect to the server


On your local machine, run an svn checkout against the server repository.

All current testing scripts will then be added to your local machine.

Add and commit your changes to the server:


To share your work with us, create any folder, file, or script inside your own folder (e.g. ChallengeScripts/Wail) and run:
1- svn add folder1
2- svn commit folder1 -m "Comments of your changes"