Friday, October 22, 2021

Scala Advanced

Generics

trait A
class B extends A
class C extends B
object LowerBoundGeneric extends App {
  class Test[T >: B](val x: T) // lower bound: T must be B or a supertype of B (A or B, not C)
  val temp = new B()
  val test: Test[B] = new Test[B](temp) // Test[C] would not compile
}
object CovariantGeneric extends App {
  class Test2[+A] { def run[T >: A](element: T) = print("working") }
  val temp2 = new C() // a C is also a B, so run (which accepts any supertype of B) can take it
  new Test2[B]().run(temp2)
}

Apply

// The compiler converts f(a) into f.apply(a)
object ApplyTest extends App {
  class Foo(x: Int) { def apply(y: Int) = x + y }
  val f = new Foo(3)
  println(f(4)) // prints 7: f(4) expands to f.apply(4) = 3 + 4
}

Partial Function

/*
A total function f: X -> Y maps every element of X to an element of Y.
A partial function does not force f to map every element of X to an element of Y,
i.e., several partial functions can handle different subsets of the same data set.
new PartialFunction[Input, Output]: if "isDefinedAt" is true then "apply" is executed.
Partial functions can be chained with orElse and andThen.
 */
object PartialTest extends App {
  val sample = 1 to 5
  val isEven = new PartialFunction[Int, String] {
    def apply(x: Int) = x + " is even"
    def isDefinedAt(x: Int) = x != 0 && x % 2 == 0
  }
  val isOdd: PartialFunction[Int, String] = {
    case x if x % 2 == 1 => x + " is odd"
  }
  val labelled = sample map (isEven orElse isOdd) // handles both even and odd elements
  println(labelled)
}

Companion Object

/*
A companion object and its class can access each other's private members (fields and methods).
They must have the same name and live in the same file.
 */
object CompanionTest extends App {
  class Person { var name = "" }
  object Person {
    def apply(name: String): Person = {
      val p = new Person()
      p.name = name
      p
    }
  }
  println(Person("Fred Flintstone").name) // expands to Person.apply("Fred Flintstone")
}


Future

/*
Anything inside Future {} runs on a different thread.
The application's main thread does not stop and wait for a Future to complete.
The result of a Future is always a Try: Success or Failure.
To make the main thread wait, use scala.concurrent.Await.result(future, 15.seconds).
Useful members: isCompleted, value, map, collect, onComplete.
 */
object FutureTest extends App {
  import scala.concurrent.{Await, Future}
  import scala.concurrent.ExecutionContext.Implicits.global
  import scala.concurrent.duration._
  import scala.util.{Failure, Success}
  val f1: Future[Int] = Future { Thread.sleep(1000); 21 + 21 }
  while (!f1.isCompleted) { println("future operation completed ?? - " + f1.isCompleted) }
  println(f1.value) // Some(Success(42))
  val f2: Future[Int] = f1.map(i => i + 1)
  f2.onComplete {
    case Success(value) => println(s"Got the callback, value = $value")
    case Failure(e)     => e.printStackTrace()
  }
  Await.result(f2, 15.seconds) // keep the main thread alive until f2 completes
}



Implicit

object ImplicitTest extends App {
  import scala.language.implicitConversions
  case class Person(name: String) { def greet = println(s"Hi, my name is $name") }
  implicit def fromStringToPerson(name: String): Person = Person(name)
  "Peter".greet // the compiler inserts fromStringToPerson("Peter").greet
}

Thursday, October 14, 2021

IBMCLOUD

Index:

  1. Basics
  2. Pre-Req
  3. Free CommandLine Tool
  4. Create Free Application
  5. API Keys
  6. Getting oAuth Tokens
    1. Standalone
    2. Ibm CLI tool
  7. Create AI application
  8. Cloudant Database
    1. Fetch the Cloudant Document from the API
  9. Functions
  10. API Gateway
  11. Simple ETL from COS to DB2
  12. Copy ETL using REST
  13. Run Spark Job on COS Data

Basics

  • IAM = Identity and Access Management (shared account access)
  • Provisioning = creating an app/service instance
  • Helm Charts = add add-ons to the provisioned app
  • There are 3 types of apps:
    • Classic Infrastructure - for individuals
    • IAM Managed Services - for enterprises / resource groups
    • Cloud Foundry - open source

Pre-Req

  • open ibmcloud
  • create a free account
  • Login as directed

Free Command Line Tool (with Python 3.8+)

  • Login to ibmcloud
  • On the toolbar of the landing page, click on IBM Cloud Shell
  • $python3

Create Free Application

  • Login to ibmcloud
  • Click on Catalog
  • Search for Cloud Foundry
  • Click on Cloud Foundry Application > Click on Create
  • Add details: Resource, App Name etc.
  • Click on Create
  • Go to homepage > Resource List > CloudFoundryApp > Click on the app
  • Click on link Visit App URL

API Keys

Getting OAuth Tokens


1) Standalone installer (https://cloud.ibm.com/docs/cli?topic=cli-getting-started)

  • Run $curl -fsSL https://clis.cloud.ibm.com/install/linux | sh   # Linux
  • ibmcloud login   # or: ibmcloud login --sso
  • ibmcloud iam oauth-tokens
  • copy the result
  • export IAM_TOKEN=<paste here>
  • Use "Authorization: Bearer $IAM_TOKEN"

2) IBMCLOUD CLI

  • Log in to IBM Cloud 
  • select Manage > Security > Platform API Keys.
  • Create an API key for your own personal identity, 
  • copy the value
  • Run below
    $curl -X POST 'https://iam.cloud.ibm.com/identity/token' \
      -H 'Content-Type: application/x-www-form-urlencoded' \
      -d 'grant_type=urn:ibm:params:oauth:grant-type:apikey&apikey=<MY_APIKEY>'

Response :
 {
        "access_token": "eyJraWQiOiIyMDExxxxxxxxxxx
  • copy the access_token and use it as below (a Python version of this exchange is sketched after this list)
  • Syntax:
    • Authorization: Bearer <access_token_value_here>
  • Example:
    • Authorization: Bearer eyJraWQiOiIyMDE3MDgwOS0wMDoxxxxxxxxx
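
A minimal Python sketch of the same token exchange using the requests library (the API key value is a placeholder you supply; this is an illustration, not the official SDK):

import requests

def get_iam_token(api_key: str) -> str:
    """Exchange an IBM Cloud API key for an IAM access token."""
    resp = requests.post(
        "https://iam.cloud.ibm.com/identity/token",
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        data={
            "grant_type": "urn:ibm:params:oauth:grant-type:apikey",
            "apikey": api_key,  # <MY_APIKEY> from Manage > Security > Platform API Keys
        },
    )
    resp.raise_for_status()
    return resp.json()["access_token"]

# usage: pass the token as "Authorization: Bearer <token>"
# token = get_iam_token("xxxxxxxx")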

Create an AI Application - Language Translator

  • Login to ibmcloud
  • Go to Catalog
  • Filter: Pricing plan = Lite
  • Category: select AI / Machine Learning
  • Click on Language Translator
  • Create
  • Check the consent on Agreement
  • Create
  • Copy the api-key and url under: Language Translator > Service Credentials
  • Replace api-key and url below (more REST calls: Language Translator > Getting Started)
curl -X POST --user "apikey:{apikey}" \
  --header "Content-Type: text/plain" \
  --data "Language Translator translates text from one language to another" \
  "{url}/v3/identify?version=2018-05-01"
  • Open IBM Cloud Shell from the ibmcloud toolbar
  • Run the new command (a Python equivalent is sketched below)
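
The same identify call from Python with requests; apikey and url are the values copied from Service Credentials (the url shown is a placeholder), and this is only a sketch of the REST call above:

import requests

apikey = "xxxxxxxx"   # from Language Translator > Service Credentials (placeholder)
url = "https://api.eu-gb.language-translator.watson.cloud.ibm.com/instances/xxxx"  # placeholder url from Service Credentials

resp = requests.post(
    url + "/v3/identify",
    params={"version": "2018-05-01"},
    auth=("apikey", apikey),   # same as curl --user "apikey:{apikey}"
    headers={"Content-Type": "text/plain"},
    data="Language Translator translates text from one language to another",
)
print(resp.json())   # detected languages with confidence scores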

Cloudant Database 

  • Login to IBMCloud
  • Go to Catalog
  • Select and create a Cloudant instance
  • Open the Cloudant instance provisioned from Resource List > Services and Software > Cloudant
  • Click on Manage > Launch Dashboard
  • Create Database > test > Click on Create
  • Open test DB > Design Document > New Doc > add a new JSON key-value pair
eg:
{
  "_id": "ce9575de70477c932e222bf5b6bd7fea",
  "name": "deepak"
}
  • Click on Create Document

Let's fetch this document from the API

  • Under the Cloudant page > Service Credentials > Create New Role > Manager > Add
  • Open the new Service Credentials created; note down apikey and url
  • Open IBM Cloud Shell from the ibmcloud toolbar (https://cloud.ibm.com/docs/account?topic=account-iamtoken_from_apikey&interface=api)
  • $curl -X POST 'https://iam.cloud.ibm.com/identity/token' -H 'Content-Type: application/x-www-form-urlencoded' -d 'grant_type=urn:ibm:params:oauth:grant-type:apikey&apikey=<MY_APIKEY>'
  • Copy the token generated
  • Run the below commands
API_BEARER_TOKEN=<paste token here>
curl -H "Authorization: Bearer $API_BEARER_TOKEN" -X GET "{url}/test/{_id from cloudant}"

Other APIs (a Python version of these calls is sketched below):

curl -H "Authorization: Bearer $API_BEARER_TOKEN" -X PUT "{url}/{db}"          # Create DB
curl -H "Authorization: Bearer $API_BEARER_TOKEN" -X PUT "{url}/{db}/{doc_id}" # Create Document
curl -H "Authorization: Bearer $API_BEARER_TOKEN" -X GET "{url}/test/{_id from cloudant}"   # Read Document

Ref : 

https://cloud.ibm.com/docs/account?topic=account-iamtoken_from_apikey&interface=api
https://cloud.ibm.com/docs/Cloudant
https://cloud.ibm.com/apidocs/cloudant#getdocument

Functions

  • Login to IBMCloud
  • Catalog > search and click Functions
  • Click on Start Creating
  • Select Quickstart templates > Hello World
  • Select Python 3 > click Deploy
Note:
To modify the Python code: Functions > Actions > hello-world

Test1:

  • Click Invoke: Result - {"greeting": "Hello stranger!"}
  • Click Invoke with parameters: {"name":"deepak"}
  • Click Invoke: Result - {"greeting": "Hello deepak!"}

Test2

  • Open IBM Cloud Shell and run:
curl -u xxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxx:xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx \
  -X POST https://eu-gb.functions.cloud.ibm.com/api/v1/namespaces/j.thepac%40gmail.com_dev/actions/hello-world/helloworld?blocking=true

Test3

Open IBM Cloud Shell
$python3    # open a Python shell
import requests
url = "https://eu-gb.functions.cloud.ibm.com/api/v1/namespaces/j.thepac%40gmail.com_dev/actions/hello-world/helloworld?blocking=true"
auth = ("xxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxx", "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx")
data = {"name": "deepak"}
r = requests.post(url, json=data, auth=auth, verify=False)
r.content

API Gateway (Proxy):

You can create a proxy link for the function URL above ("https://eu-gb.functions.cloud.ibm.com/api/v1/namespaces/j.thepac%40gmail.com_dev/actions/hello-world/helloworld?blocking=true") by creating an API Gateway and providing that URL as the target.

Simple ETL from COS to DB2

Pre-Req:

DB2:

  • Make sure you have created a DB2 instance in IBMCloud
  • Create a table in DB2 (do not insert any records)
  • CREATE TABLE table_name (col1 int, col2 varchar(255)); -- successfully created
  • In the Db2 UI > Data icon > Tables
  • Click on the schema
  • Check that the table is created
    • Test it
      • Syntax: SELECT * FROM schema.table;
      • Example: SELECT * FROM DXC02390.table_name;
  • Note down the schema name and table name
  • Click on the About icon in the DB2 UI
  • Note down the "<crn ..........::>"

Cloud Object Storage (COS):

  • Create a Cloud Object Storage (COS) instance in IBM Cloud
  • Create a bucket
  • Add a Parquet file with a schema similar to the table created above (use Apache Spark to create the file locally, then drag and drop)
  • Select the uploaded Parquet file > Object Details > copy the Object SQL URL

Steps:

  • Create an SQL Query instance in ibmcloud
  • Run the below command to copy the data from COS to DB2 (a Python sketch using the ibmcloudsql client follows the example)
Syntax:
SELECT * FROM <Object SQL URL> STORED AS PARQUET INTO crn:xxxxxxx:/schema.table PARALLELISM 2

Example:
SELECT * FROM cos://jp-tok/cloud-object-storage-7d-cos-standard-gsi/test2Cols.parquet STORED AS PARQUET
INTO 
crn:v1:bluemix:public:dashdb-for-transactions:eu-gb:a/e31b7085afca4ab8b6ac9b1077cd8af9:9257e5bc-49f0-43a1-b776-f7a0ff41b2b6::/DXC02390.MONOREPO_POC PARALLELISM 2
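
The same copy can also be submitted programmatically with the ibmcloudsql client used in the "Run Spark Job on COS Data" section below. The key, CRN, and bucket values are placeholders, and the submit_sql/wait_for_job method names come from the sql-query-clients Python library linked in the references; treat this as a sketch, not a verified recipe:

import ibmcloudsql

cloud_api_key = "<API key from Manage > Access > API keys>"    # placeholder
sql_crn = "<crn of the SQL Query instance>"                    # placeholder
sql_cos_endpoint = "cos://jp-tok/<bucket>/<result_prefix>/"    # placeholder

etl_sql = ("SELECT * FROM cos://jp-tok/cloud-object-storage-7d-cos-standard-gsi/test2Cols.parquet "
           "STORED AS PARQUET INTO crn:xxxxxxx:/DXC02390.MONOREPO_POC PARALLELISM 2")

sqlClient = ibmcloudsql.SQLQuery(cloud_api_key, sql_crn, sql_cos_endpoint)
job_id = sqlClient.submit_sql(etl_sql)   # submit the job (no result set to fetch for a DB2 target)
sqlClient.wait_for_job(job_id)           # block until the copy finishes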

Copy ETL using REST 

Pre-Req:  Simple ETL from COS to DB2

curl -X POST 'https://iam.cloud.ibm.com/identity/token' \
    -H 'Content-Type: application/x-www-form-urlencoded' \
    -d 'grant_type=urn:ibm:params:oauth:grant-type:apikey&apikey={create an API key from Manage > Access > API keys}'

Copy the access_token from the response and save it:
API_TOKEN="xxxxxx"
or
export API_TOKEN="xxxxxx"

Get Current Jobs

curl -XGET   \
--url "https://api.sql-query.cloud.ibm.com/v3/sql_jobs?type=batch&instance_crn=crn:v1:bluemix:public:sql-query:in-che:a/e31b7085afca4ab8b6ac9b1077cd8af9:29b693b9-b195-4549-a2b0-03c93a26e3d1::"  \
 -H "Accept: application/json"  \
 -H "Authorization: Bearer <API_TOKEN>" 

#type=batch or type=stream

# Copy from one Parquet file to another (a Python version of these REST calls is sketched below)
curl -XPOST  \
--url "https://api.sql-query.cloud.ibm.com/v3/sql_jobs?instance_crn=crn:v1:bluemix:public:sql-query:in-che:a/e31b7085afca4ab8b6ac9b1077cd8af9:29b693b9-b195-4549-a2b0-03c93a26e3d1::"  \
-H "Accept: application/json"  \
-H "Authorization:Bearer <API_TOKEN>"  \
-H "Content-Type: application/json"   \
-d '{"statement":"SELECT * FROM cos://jp-tok/cloud-object-storage-7d-cos-standard-gsi/test2Cols.parquet STORED AS PARQUET INTO cos://jp-tok/cloud-object-storage-7d-cos-standard-gsi/test2Cols_result"  }'

Run Spark Job on COS Data

  • Login to IBMCLOUD
  • Go to Catalog > search for Watson Studio
  • Agree to the terms and conditions > Click on Create
  • Click on Next > Next > click Create Watson Studio
  • Click on Projects > New Project > Empty Project
  • Add to Project > Notebook
  • Select Runtime > Python (smallest configuration)
!pip -q install ibmcloudsql
import ibmcloudsql

cloud_api_key="Create api key from Manage"
sql_crn="crn of SQL Query Instance"
sql_cos_endpoint="cosendpoint of bucket/result_prefix"
query="right click on the COS parq file and click on SQL Query"

sqlClient = ibmcloudsql.SQLQuery(cloud_api_key, sql_crn, sql_cos_endpoint) 
#sqlClient =ibmcloud.sqlSQLQuery(my_ibmcloud_apikey, my_instance_crn)

res=sqlClient.run_sql(query)
  • You can create a job and run the notebook at a specific time and results can be seen in the Jobs tab.

Note:

  1. Any file you drag and drop into the Notebook will automatically get saved into COS.
  2. Click on "Insert code" to add Spark code that works on the DataFrame.


Ref:
  1. https://cloud.ibm.com/docs/sql-query
  2. https://medium.com/codait/analyzing-data-with-ibm-cloud-sql-query-bc53566a59f5
  3. https://cloud.ibm.com/docs/sql-query?topic=sql-query-data-transport-automation-to-db2-on-cloud
  4. https://www.ibm.com/cloud/blog/announcements/automate-serverless-data-pipelines-for-your-data-warehouse-or-data-lakes
  5. https://dataplatform.cloud.ibm.com/exchange/public/entry/view/4a9bb1c816fb1e0f31fec5d580e4e14d
  6. https://cloud.ibm.com/docs/sql-query?topic=sql-query-sql-reference
  7. https://video.ibm.com/playlist/633112 #https://www.youtube.com/watch?v=s-FznfHJpoU
  8. https://cloud.ibm.com/apidocs/sql-query-v3#introduction #REST
  9. https://cloud.ibm.com/apidocs/db2-on-cloud/db2-on-cloud-v4
  10. https://video.ibm.com/playlist/633075 #jupyter notebook
  11. https://cloud.ibm.com/docs/AnalyticsEngine?topic=AnalyticsEngine-working-with-sql#running-spark-sql-with-scala
  12. https://github.com/IBM-Cloud/sql-query-clients
  13. https://github.com/IBM-Cloud/sql-query-clients/tree/master/Python

Monday, October 11, 2021

Bazel

Creating a Bazel Project

Fast Setup Guide

1.  Make sure Bazel is installed on your computer
2.  Create a new project folder
3.  cd inside the project folder
4.  Create a new "WORKSPACE" file
5.  Create a python/Program folder
6.  cd to Program
7.  Create a new BUILD file:

    package(default_visibility = ["//visibility:public"])
    py_binary(
        name = 'hello',       # any name
        main = 'hello.py',    # entry-point file, e.g. parentfolder.file
        srcs = ['hello.py'],  # source file names
    )

8.  $echo "print('hi')" > hello.py
9.  Make sure you are in the folder containing the BUILD file
10. $bazel run hello

Bazel has default rules for Python and Java, i.e., you can start with an empty WORKSPACE and run Python/Java source files.

Refer for other languages:
https://docs.bazel.build/versions/4.2.1/rules.html

Other Languages (example scala):


You need to configure the WORKSPACE: http_archive, skylib, language-specific rules, Maven. A WORKSPACE sketch follows this list.

  1. Start with http_archive - support to download packages over https
    1. load("@bazel_tools//tools/build_defs/repo:http.bzl", "http_archive")
  2. Download skylib - support for shell commands
    1. skylib_version = "0.8.0"
      http_archive(
          name = "bazel_skylib",
          type = "tar.gz",
          url = "https://github.com/bazelbuild/bazel-skylib/releases/download/{}/bazel-skylib.{}.tar.gz".format(skylib_version, skylib_version),
          sha256 = "2ef429f5d7ce7111263289644d233707dba35e39696377ebab8b0bc701f7818e",
      )
  3. Load
    1. rules_scala: rules like scala_binary, scala_test etc. to use in BUILD files
    2. scala_config: configure the Scala version
    3. scala_register_toolchains: for using the jar file built from one language as input to another
    4. scala_repositories: to download default libraries for Scala
  4. Set Maven as the third-party repo
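
A rough WORKSPACE sketch of steps 1-4 above. The rules_scala version, sha256 values, and .bzl load paths are assumptions that change between releases, so treat this as an outline and check the rules_scala README for the exact snippet:

load("@bazel_tools//tools/build_defs/repo:http.bzl", "http_archive")

# 2. bazel-skylib (shell helpers etc.)
skylib_version = "0.8.0"
http_archive(
    name = "bazel_skylib",
    type = "tar.gz",
    url = "https://github.com/bazelbuild/bazel-skylib/releases/download/{}/bazel-skylib.{}.tar.gz".format(skylib_version, skylib_version),
    sha256 = "2ef429f5d7ce7111263289644d233707dba35e39696377ebab8b0bc701f7818e",
)

# 3. rules_scala (version is a placeholder; add the matching sha256)
http_archive(
    name = "io_bazel_rules_scala",
    strip_prefix = "rules_scala-<version>",
    url = "https://github.com/bazelbuild/rules_scala/archive/<version>.tar.gz",
)

# the load paths below are assumptions; they differ slightly across rules_scala releases
load("@io_bazel_rules_scala//:scala_config.bzl", "scala_config")
scala_config(scala_version = "2.12.14")

load("@io_bazel_rules_scala//scala:scala.bzl", "scala_repositories")
scala_repositories()

load("@io_bazel_rules_scala//scala:toolchains.bzl", "scala_register_toolchains")
scala_register_toolchains()

# 4. third-party Maven deps are usually wired in with rules_jvm_external's maven_install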

IntelliJ Setup

1. Make sure IntelliJ has the Bazel plugin installed
2. Import the above project as a Bazel project
3. Create new
4. Next (if you already have a .ijwb/ folder created, make sure it is deleted)
5. Done

Common Commands :

  • bazel build target #target can be name of build or //path of package:target
  • bazel run target
  • bazel test target
  • bazel coverage target
  • bazel query deps(target)
  • bazel fetch target
  • bazel version
  • bazel clean --expunge

Advantages:

  • Google product
  • Language independent
  • Platform independent (Mac, Linux, etc.)
  • Hermetic (builds are reproducible every time)
  • Cross-language dependencies (a Python library can call a Java binary, etc.)
  • Handles large code bases
  • Caches dependencies
  • Parallel builds
  • Remote execution/caching (dependencies can be fetched and built remotely)
  • Dependency Tree Feature
  • Query Dependencies

Cons : 

  • Network dependency (first build and any new dependency)
  • Every dependency must be enlisted (i.e., if a dependency uses another dependency, that one has to be declared too)
  • Manually declaring all dependencies can lead to version conflicts when one library uses one version and another library uses a different version

Features

  • Bazel files follow Python syntax
  • Workspace: a folder with a WORKSPACE file, also called a Bazel repo
  • Package: a folder inside the Bazel repo with a BUILD file. This folder contains source code files and other files
  • Target: everything inside your packages can be considered a target
  • Label: the nomenclature of a target; it is just a way to refer to different targets
  • .bazelrc: settings that are taken into account every time Bazel builds your project
  • buildifier: used to ensure that all your BUILD files are formatted in a similar fashion

WORKSPACE file

Lists all the external repos the Bazel repo depends on.

 Example:

workspace(name = "intro_to_bazel") # name of the workspace

# load("filename", "method")
load("@bazel_tools//tools/build_defs/repo:git.bzl", "git_repository")
git_repository(
    name = "com_github_xxx",
    commit = "xxxxxxxxxxxxxxxx",
    remote = "https://github.com/xxx",
)

Rule Definitions in the WORKSPACE

  • Example:load("//foo/bar:file.bzl", "some_library")
  • This code will load the file foo/bar/file.bzl and add the some_library symbol to the environment. 
  • This can be used to load new rules, functions or constants (e.g. a string, a list, etc.).
  • *_binary rules build executable programs in a given language. After a build, the executable will reside in the build tool's binary output tree 
  • *_library rules specify separately-compiled modules in the given programming language

  • *_test rules are a specialization of a *_binary rule, used for automated testing

Note :

  • https://github.com/bazelbuild/examples/tree/main/java-tutorial
  • In this project WORKSPACE is empty because  Native rules ship with the Bazel binary and do not require a load statement. Native rules are available globally in BUILD files.
  • But for Scala, Python, etc., you need to include load statements in the WORKSPACE and use them in BUILD files

Steps:

  • Open the link https://github.com/bazelbuild
  • Select the repos you need for creating your project
  • Example: if you want to add "bazel-skylib" (provides functions, file paths, and data types in BUILD files)

####### WORKSPACE ########

load("@bazel_skylib//:workspace.bzl", "bazel_skylib_workspace")
bazel_skylib_workspace()


BUILD File
load("@bazel_skylib//lib:paths.bzl", "paths")
load("@bazel_skylib//lib:shell.bzl", "shell")
p = paths.basename("foo.bar")
s = shell.quote(p)

  • Since Scala does not ship with Bazel directly, you need to include "rules_scala" from bazelbuild in the WORKSPACE
  • And use scala_binary, scala_library, scala_test etc. to build and test

BUILD

  • Folder with BUILD is called Package
  • Contains rules. scala_binary, java_binary etc.,

Example:

common/BUILD 
scala_library(
    name = "common",
    srcs = glob(["*.scala"]),
    visibility = ["//visibility:public"],
)


source/BUILD
scala_binary(
    name = "eid",
    srcs = glob(["filename.scala"]),
    main_class = "com.company.project.filename",
    deps = [
        "//path/common",
    ]
)

  • xxx_library takes sources, deps and a label (i.e., the path to other Bazel packages)
  • xxx_library creates a library
  • //packagename:target
    • // - root of the workspace
    • packagename - path of the package (the folder containing the BUILD file)
    • target - a particular target inside that package

  • srcs dependencies :Files consumed directly by the rule or rules that output source files.
  • deps dependencies: Rule pointing to separately-compiled modules providing header files, symbols, libraries, data, etc.
  • data dependencies:A build target might need some data files to run correctly.
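
A small BUILD sketch showing the three kinds of dependencies on one target (file names and package paths are made up for illustration):

py_binary(
    name = "tool",
    srcs = ["tool.py"],                 # srcs: files consumed directly by the rule
    deps = ["//common:helpers"],        # deps: separately-built modules this target uses
    data = ["config/settings.json"],    # data: files the binary needs at runtime
)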

Query Dependencies 

  • bazel query "deps(//foo)"
  • bazel query "allpaths(//foo, third_party/...)"
  • bazel query --noimplicit_deps 'deps(//package:target)' --output graph | dot -Tpng > graph.png
# if you are already inside the package:
  • bazel query --noimplicit_deps 'deps(target)' --output graph | dot -Tpng > graph.png
  • bazel query --noimplicit_deps 'deps(microservice)' --output graph | dot -Tpng > graph.png
  • bazel query --noimplicit_deps 'deps(microservice)' --output graph > simplified_graph.in

  • bazel query 'foo/...' --output package # What packages exist beneath foo?
  • bazel query 'kind(rule, foo:*)' --output label_kind #What rules are defined in the foo package?
  • bazel query 'kind("generated file", //foo:*)' #What files are generated by rules in the foo package?
  • bazel query 'attr(generator_function, foo, //path/to/search/...)' #What targets are generated by starlark macro foo?
  • bazel query 'buildfiles(deps(//foo))' | cut -f1 -d: #What's the set of BUILD files needed to build //foo?
  • bazel query 'tests(//foo:smoke_tests)' #What are the individual tests that a test_suite expands to?
  • bazel query 'kind(cc_.*, tests(//foo:smoke_tests))' #Which of those are C++ tests?
  • bazel query 'attr(size, small, tests(//foo:smoke_tests))' #Which of those are small? Medium? Large?
  • bazel query 'filter("pa?t", kind(".*_test rule", //foo/...))' #What are the tests beneath foo that match a pattern?
  • bazel query path/to/file/bar.java --output=package #What package contains file path/to/file/bar.java?
  • bazel query path/to/file/bar.java #What is the build label for path/to/file/bar.java?
  • bazel query 'buildfiles(deps(//foo:foo))' --output package #What packages does foo depend on?
  • bazel query 'deps(foo/... except foo/contrib/...)' --output package #What packages does the foo tree depend on, excluding foo/contrib
  • bazel query 'kind(genproto, deps(bar/...))' #What genproto rules does bar depend upon
  • bazel query 'kind("source file", deps(//path/to/target/foo/...))' | grep java$ #What file dependencies exist
  • bazel query 'deps(//foo) except deps(//foo:foolib)' #What targets does //foo depend on that //foo:foolib does not?
  • bazel query 'somepath(bar/...,groups2/...:*)' #Why does bar depend on groups2

Rules 

Read the output of one BUILD file in another BUILD file.



Sunday, October 10, 2021

Java Package Names and Naming Convention:

  • If you're just doing personal projects where nobody else will use the code, then you can use any package name.
  • Don't make up something that starts with com. or net. or another top-level domain though, because that would imply that you own the domain name (i.e., using com.john as your package name just because your name happens to be John is not a good idea).
  • The domain-name-backwards convention is there to prevent name collisions: two different companies with the same product name will have different namespaces, so everything works fine.

Ref:

  • https://stackoverflow.com/a/292175
  • https://docs.oracle.com/javase/tutorial/java/package/namingpkgs.html
  • https://stackoverflow.com/a/6247924