Showing posts with label Programming Concepts. Show all posts

Thursday, November 10, 2016

How to determine/specify encoding?

  • In Email - Content-Type: text/plain; charset="UTF-8"
  • In Client/Web page - set the encoding using one of the following (ordered from highest to lowest priority for browsers):
    1) charset parameter on HTTP Content-Type response header from server
    Content-Type: text/html; charset=UTF-8
    2) charset on meta tag/element of HTML response
    <html>
    <head>
    <!-- put the charset declaration as early as possible in <head> (HTML5 requires it within the first 1024 bytes), because a browser that encounters it late may stop parsing the page and reinterpret it -->
    <!-- HTML5: the charset attribute was introduced in HTML5 and is the recommended form -->
    <meta charset="UTF-8">
    <!-- HTML versions lower than HTML5: use the http-equiv attribute instead -->
    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
    </head>
    </html>
  • In Spring, to specify the encoding to be used when decoding form data - add a filter for encoding in web.xml
    <filter>  
        <filter-name>EncodingFilter</filter-name>  
        <filter-class>org.springframework.web.filter.CharacterEncodingFilter</filter-class>  
        <init-param>  
           <param-name>encoding</param-name>  
           <param-value>UTF-8</param-value>  
        </init-param>  
        <init-param>  
           <param-name>forceEncoding</param-name>  
           <param-value>true</param-value>  
        </init-param>  
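    </filter>
    <!-- a filter only takes effect once it is mapped to a URL pattern -->
    <filter-mapping>
        <filter-name>EncodingFilter</filter-name>
        <url-pattern>/*</url-pattern>
    </filter-mapping>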

More on specifying encoding: http://stackoverflow.com/questions/138948/how-to-get-utf-8-working-in-java-webapps

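Possible consequence of not specifying encoding: the receiver guesses or falls back to a default charset, and non-ASCII characters can show up garbled (mojibake).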
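
A minimal Python sketch of what can go wrong when the charset is not declared. The sample text is made up for illustration, and the assumption is a sender that writes UTF-8 while the receiver falls back to Windows-1252:

```python
# Hypothetical sample text; any non-ASCII string shows the same effect.
original = "café €5"

# The sender writes UTF-8 bytes but does not declare the charset.
payload = original.encode("utf-8")

# A receiver that assumes Windows-1252 turns the same bytes into other characters.
garbled = payload.decode("cp1252")

# Decoding with the charset that was actually used restores the text.
restored = payload.decode("utf-8")

print(garbled)
print(restored)
```

The bytes never change; only the charset used to interpret them does, which is why declaring it in the header, meta tag or filter matters.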

Thursday, April 17, 2014

Scrum

Scrum

 - one of the most widely used Agile practices

Product backlog

 - user stories/wish list

Roles

  • product owner - chooses what to deliver from the product backlog
  • scrum master - similar to a PM; facilitates the process and helps remove obstacles
  • developers
  • testers
  • customers
  • executives

Release planning

  • release backlog with estimates
    estimate by story point
    estimate by time - 1, 2, 4, 8 hours or 2, 3, 5, 10 days or 1, 2, 3, 6 months
  • sprints - short-duration milestones (2 to 30 days) -> ship-ready state
  • burndown chart - chart of work remaining over time (per day)
    burndown velocity - average rate of productivity, i.e. the slope of the burndown chart
  • sprint backlog
  • daily scrum - stand-up meetings covering completed tasks & obstacles
  • sprint retrospective - what went right? and what went wrong/areas of improvement?
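
The burndown velocity above is simple arithmetic, sketched here in Python. The function names and the sample figures are made up for illustration, and a straight-line (average) velocity is assumed:

```python
def burndown_velocity(remaining_by_day):
    """Average work completed per day: the negated slope of the burndown line."""
    if len(remaining_by_day) < 2:
        raise ValueError("need at least two data points")
    days_elapsed = len(remaining_by_day) - 1
    return (remaining_by_day[0] - remaining_by_day[-1]) / days_elapsed


def days_to_finish(remaining_by_day):
    """Projected days until no work remains at the current velocity."""
    velocity = burndown_velocity(remaining_by_day)
    if velocity <= 0:
        return None  # no progress so far, so no projection is possible
    return remaining_by_day[-1] / velocity


# Hypothetical sprint: hours of work remaining at the end of each day.
remaining = [80, 72, 66, 56, 48]
print(burndown_velocity(remaining))
print(days_to_finish(remaining))
```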

Software Development Methodologies

A software development methodology or system development methodology in software engineering is a framework that is used to structure, plan, and control the process of developing an information system. 

Broadly these are:
  1. Software development life cycle methodology
  2. Agile methodology 

There are many models under these methodologies
  1. Software development life cycle:
    • Waterfall: a linear framework 
    • Spiral: a combined linear-iterative framework 
    • Incremental: a combined linear-iterative framework or V Model 
    • Prototyping: an iterative framework 
    • Rapid application development (RAD): an iterative framework
  2. Agile methodology:
    • Scrum 
    • Extreme programming
    • Adaptive software development (ASD) 
    • Dynamic systems development method (DSDM)
Sources:
http://www.itinfo.am/eng/software-development-methodologies
http://en.wikipedia.org/wiki/Software_development_methodology

Sunday, September 22, 2013

Understanding Character Sets, Encoding and Unicode

Character Set/charset

- set of characters that may or may not define an encoding
- Examples: ASCII (covers the basic English letters, digits and punctuation), ISO/IEC 646, Unicode (aims to cover the characters of all the world's writing systems)

Encoding/Character encoding/Character set encoding

- General meaning: a set of rules or system for representing a character in some form such as bit pattern, sequence of natural numbers, octets, or electrical pulses, e.g. Morse code, Baudot code, ASCII and Unicode
- More strict meaning: a mapping of characters to how they are stored in memory (bit sequence)
- Examples: ASCII encoding, Unicode encodings like UTF-8 and UTF-16


Source of Encoding Standards:

  1. Standards bodies
    ANSI (American National Standards Institute)
    - is the U.S. standards organization that creates standards (like the ASCII) for the computer industry
    ISO (International Organization for Standardization)
    - largest developer of voluntary International Standards
    - adopted ASCII as ISO 646:IRV
  2. Independent software vendors
    IBM
    - developed codepage 437 for DOS, codepage 852 for Eastern European languages that use Latin script, codepage 855 for Russian and some other Eastern European languages that use Cyrillic script, etc.
    Microsoft
    - developed the familiar Windows codepages, such as codepage 1252, alternately known as "Western", "Latin 1" or "ANSI"


Examples of character sets or encodings


ASCII (American Standard Code for Information Interchange)
- is a 7-bit encoding scheme used to encode letters, numerals, symbols, and device control codes as fixed-length codes using integers
- includes definitions for 128 characters
- codes 128 to 255 (available once characters are stored in 8-bit bytes) are not defined by ASCII, so vendors filled them differently, resulting in many incompatible ASCII extensions


EBCDIC (Extended Binary Coded Decimal Interchange Code)
- is an 8-bit character encoding used mainly on IBM mainframe and IBM midrange computer operating systems.


Codepage 1252 and ISO 8859-1
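- ISO 8859-1 “Latin 1” is a standard published by ISO; the American National Standards Institute (ANSI) proposal mentioned below was an early draft of it
- Codepage 1252 is a standard created by Microsoft for Western European languages based on an early draft of the ANSI proposal that later became ISO 8859-1 “Latin 1”
- Codepage 1252 was finalised before ISO 8859-1 was finalised, however, and the two are not the same: Codepage 1252 is a superset of ISO 8859-1 as far as printable characters go, because it assigns printable characters (such as € and curly quotes) to bytes 0x80 to 0x9F, which ISO 8859-1 reserves for control codes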
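
A short Python sketch of the difference, using Python's built-in "cp1252" and "latin-1" codecs. The three sample bytes are chosen here only for illustration:

```python
# Bytes 0x80-0x9F are where the two encodings differ.
sample = bytes([0x80, 0x93, 0x94])

as_cp1252 = sample.decode("cp1252")    # euro sign and curly quotes
as_latin1 = sample.decode("latin-1")   # invisible C1 control characters

# Outside that range the two agree, e.g. byte 0xE9 is "é" in both.
same_elsewhere = bytes([0xE9]).decode("cp1252") == bytes([0xE9]).decode("latin-1")

print(as_cp1252)
print([hex(ord(ch)) for ch in as_latin1])
print(same_elsewhere)
```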

ANSI codepage
- Microsoft referred to Codepage 1252 as "the ANSI codepage", but around the time of Windows 95 development, Microsoft began to use the term "ANSI" in a different sense to mean any of the Windows codepages, as opposed to Unicode
- currently in the context of Windows, the terms "ANSI text" or "ANSI codepage" should be understood to mean text that is encoded with any of the legacy 8-bit Windows codepages rather than Unicode. It really should not be used to mean the specific codepage associated with the US version of Windows, which is Codepage 1252.

Other Legacy encoding standards
- most encode each character in terms of a single 8-bit processing unit, or byte
- some are double-byte encodings like Microsoft codepages for Chinese, Japanese and Korean


UTF-8 and Unicode


Unicode
- is a standard developed by the Unicode Consortium that assigns a unique number/identifier for every character, no matter what the platform, no matter what the program, no matter what the language
- In Unicode, every character is assigned a unique number called "code point"

Ways of Encoding Unicode

  1. UCS-2 (because it has two bytes) - the traditional store-it-in-two-bytes method; fixed width, so it can only represent code points up to U+FFFF
  2. UTF-16 (because it uses 16-bit units) - extends UCS-2 with surrogate pairs, so a character takes two or four bytes; you have to figure out if it's big-endian (most significant byte first) or little-endian (least significant byte first), usually through the BOM (byte-order mark)
  3. UTF-8 (Unicode Transformation Format 8-bit)
    - is a variable-width encoding that can represent every character in the Unicode character set. It was designed for backward compatibility with ASCII and to avoid the complications of endianness and byte order marks in UTF-16 and UTF-32
  4. UTF-7 - similar to UTF-8 but guarantees that the high bit will always be zero
  5. UCS-4 (also known as UTF-32) - stores each code point in 4 bytes
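
To make the code point vs. encoding distinction concrete, here is a small Python sketch. The helper name encoded_sizes and the sample characters are picked here for illustration:

```python
def encoded_sizes(ch):
    """Return the code point of ch and its size in bytes in three Unicode encodings."""
    return {
        "code_point": f"U+{ord(ch):04X}",
        "utf-8": len(ch.encode("utf-8")),
        "utf-16": len(ch.encode("utf-16-be")),
        "utf-32": len(ch.encode("utf-32-be")),
    }


# "A", e with acute accent, euro sign, and an emoji outside the 16-bit range.
for ch in ["A", "\u00e9", "\u20ac", "\U0001F600"]:
    print(encoded_sizes(ch))

# Byte order only matters for the multi-byte units of UTF-16 and UTF-32.
big_endian = "A".encode("utf-16-be")     # most significant byte first
little_endian = "A".encode("utf-16-le")  # least significant byte first
print(big_endian, little_endian)
```

The same code point stays the same number in every encoding; only the stored bytes differ. The emoji needs four bytes in UTF-16 because it is stored as a surrogate pair.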


Other related terms


Code Page

- is a term that originated from IBM that essentially means the same as character set and encoding

Internationalized URL / URL encoding / Percent encoding 

- see https://www.w3.org/International/articles/idn-and-iri/
- see http://www.url-encode-decode.com/
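
As a rough illustration of percent encoding, Python's standard urllib.parse module encodes non-ASCII characters as their UTF-8 bytes written as %XX. The path below is a made-up example:

```python
from urllib.parse import quote, unquote

# Hypothetical path containing a non-ASCII character and a space.
path = "/docs/café menu.html"

encoded = quote(path)       # "/" is left as-is by default
decoded = unquote(encoded)  # decodes the %XX sequences as UTF-8

print(encoded)
print(decoded)
```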

Sources:
http://www.unicode.org/
http://en.wikipedia.org
http://www.joelonsoftware.com/articles/Unicode.html
http://scripts.sil.org/cms/scripts/page.php?item_id=IWS-Chapter03
http://mikesusan.com/ascii.html
http://www.utf-8.com/
http://kunststube.net/encoding/

Sunday, June 2, 2013

SSL and certificates

Terms:

  • SSL (Secure Sockets Layer) - a security protocol that provides an encrypted, authenticated connection between a server and a client. Its successor is TLS (Transport Layer Security).
  • https - the URL scheme at the beginning of the address of an SSL-secured website
  • SSL Certificate - a small data file that binds a public key to the subject identifying the certificate and is used to establish an encrypted connection. It contains only the public key of the key pair; the private key stays on the server. Typically an SSL Certificate will contain your domain name, your company name, your address, your city, your state and your country. It will also contain the expiration date of the Certificate and details of the Certification Authority responsible for the issuance of the Certificate.
  • Certificate Authority or CA - the SSL Certificate issuer. It researches companies, checks references, verifies identity and digitally signs the certificates it issues.
  • certificate chain - the series of certificates that links the server certificate, through any intermediate certificates, to a trusted root CA certificate
  • public, private, and session keys - anything encrypted with the public key can only be decrypted with the private key, and vice versa. After the secure connection is made, the symmetric session key is used to encrypt all transmitted data.

Server Setup: (http://www.lwithers.me.uk/articles/cacert.html)

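  1. In order for a server to handle SSL connections, SSL must be activated on it.
  2. The server administrator is prompted with several questions about the identity of the website or organization.
  3. Server generates a private key and a CSR (Certificate Signing Request) data file. The CSR contains the public key and the identity details; the private key stays on the server.
  4. The CA verifies the request and uses the CSR data file to create the certificate: it signs the public key and identity details, so the certificate matches the server's private key without the CA ever seeing that key.
  5. The CA sends the SSL certificate.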
  6. Server installs the SSL certificate. (http://www.digicert.com/ssl-certificate-installation.htm)

How it works:
  1. Browser connects to a web server secured with SSL (https). Browser requests that the server identify itself.
  2. Server sends a copy of its SSL Certificate (including the server’s public key), to assure the client that it can be trusted. The SSL Certificate was issued by a CA.
  3. Browser checks the certificate root against a list of trusted CAs and that the certificate is unexpired, unrevoked, and that its common name is valid for the website that it is connecting to. If the browser trusts the certificate, it creates, encrypts, and sends back a symmetric session key using the server’s public key. --- "SSL handshake"  
  4. Server decrypts the symmetric session key using its private key and sends back an acknowledgement encrypted with the session key to start the encrypted session.
  5. Server and Browser now encrypt all transmitted data with the session key.
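
The client side of these steps can be sketched with Python's standard ssl module. This is only an illustration: the function names are made up, a reachable HTTPS host is assumed for the second function, and a modern TLS library may agree on the session key differently from the key exchange described above.

```python
import socket
import ssl


def build_client_context():
    """Client settings that perform the certificate checks from step 3."""
    # Loads the platform's trusted CA certificates, requires a valid
    # certificate chain and checks that the certificate matches the host name.
    return ssl.create_default_context()


def fetch_server_certificate(host, port=443):
    """Run the handshake against host and return the server certificate fields."""
    context = build_client_context()
    with socket.create_connection((host, port), timeout=10) as raw_sock:
        # wrap_socket performs the handshake; it raises
        # ssl.SSLCertVerificationError if the certificate is not trusted.
        with context.wrap_socket(raw_sock, server_hostname=host) as tls_sock:
            return tls_sock.getpeercert()


context = build_client_context()
print(context.verify_mode == ssl.CERT_REQUIRED)
print(context.check_hostname)
```

Calling fetch_server_certificate with a host name would return a dictionary with fields such as the subject, the issuer and the expiration date, which correspond to the certificate contents listed under Terms.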

Commands:

  • the default password of the Java cacerts keystore is changeit
  • list certificates
    keytool -list -v -keystore [cacerts location]
    ex. keytool -list -v -keystore cacerts.jks
  • list certificates to a text file
    keytool -list -v -keystore [cacerts location] > [text file path]
    ex. keytool -list -v -keystore "C:/Program Files (x86)/Java/jre6/lib/security/cacerts" > java_cacerts.txt
  • delete certificate (used when certificate is expired)
    keytool -delete -v -alias [alias] -keystore [cacerts location]
    ex. keytool -delete -v -alias [alias] -keystore cacerts.jks
  • add certificate to cacerts
    keytool -import -alias [alias name] -keystore [cacerts location] -file [cert to add path]
    ex. keytool -import -alias Verisign -keystore "C:/Program Files (x86)/Java/jre6/lib/security/cacerts" -file C:/bel/docs/certs/Verisign.cer


Source:
http://www.digicert.com/ssl.htm

Sunday, January 13, 2013

Software Testing

Testing Levels
  • Unit testing
    • done in local/DEV
    • testing individual units of code (e.g. individual fixes) in isolation
  • Integration testing
    • done in SIT
    • testing integrated modules
    • deals with integration of a process in the system, not the integration of the whole system
  • System testing
    • done in SIT
    • testing the system as a whole
    • Types of system testing:
      • Usability testing – how well the user can access the different features of the system and how easy it is to use.
      • GUI software testing – checks that the program looks as intended and that the GUI works as intended.
      • Security testing – checks that important information is secure and that access restrictions work.
      • Accessibility testing – how easy it is for various users, including users with disabilities, to use the system.
      • Reliability testing – checks that the system works for a long period of time and does not constantly crash.
  • User acceptance testing
    • done in UAT
    • obtain confirmation that a system meets mutually agreed-upon requirements
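
For the lowest level, a unit test can be as small as the following Python sketch using the standard unittest module. The function apply_discount is a made-up unit under test, not taken from any real system:

```python
import unittest


def apply_discount(price, percent):
    """Hypothetical unit under test: reduce price by the given percentage."""
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    return round(price * (100 - percent) / 100, 2)


class ApplyDiscountTest(unittest.TestCase):
    def test_typical_discount(self):
        self.assertEqual(apply_discount(200.0, 25), 150.0)

    def test_zero_discount_keeps_price(self):
        self.assertEqual(apply_discount(80.0, 0), 80.0)

    def test_invalid_percent_is_rejected(self):
        with self.assertRaises(ValueError):
            apply_discount(100.0, 120)


def run_tests():
    """Run the test case in-process and return the result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(ApplyDiscountTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)


result = run_tests()
print(result.wasSuccessful())
```

Each test exercises one unit in isolation, which is what separates this level from integration and system testing.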

Wednesday, April 22, 2009

Design Patterns

Definitions on this blog came from the book Head First Design Patterns.

  • Strategy pattern - defines a family of algorithms, encapsulates each one, and makes them interchangeable. Strategy lets the algorithm vary independently from clients that use it.
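
A minimal Python sketch of the Strategy pattern, loosely modeled on a duck with swappable fly behaviors. The class names are illustrative and the code is not taken from the book:

```python
class FlyWithWings:
    """One algorithm of the family."""
    def fly(self):
        return "flying with wings"


class FlyNoWay:
    """Another interchangeable algorithm with the same interface."""
    def fly(self):
        return "cannot fly"


class Duck:
    """Client: delegates to whichever fly behavior it currently holds."""
    def __init__(self, fly_behavior):
        self.fly_behavior = fly_behavior

    def perform_fly(self):
        return self.fly_behavior.fly()


duck = Duck(FlyWithWings())
print(duck.perform_fly())
duck.fly_behavior = FlyNoWay()  # swap the algorithm at run time
print(duck.perform_fly())
```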


  • Observer pattern - defines a one-to-many dependency between objects so that when one object changes state, all its dependents are notified and updated automatically.
    - think Swing/GUI
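
A minimal Python sketch of the Observer pattern in which the subject pushes each new value to its observers. The weather station and display names are illustrative and the code is not taken from the book:

```python
class WeatherStation:
    """Subject: keeps a list of observers and notifies them of changes."""
    def __init__(self):
        self._observers = []

    def register(self, observer):
        self._observers.append(observer)

    def unregister(self, observer):
        self._observers.remove(observer)

    def set_temperature(self, celsius):
        # A state change is pushed to every registered observer.
        for observer in list(self._observers):
            observer.update(celsius)


class Display:
    """Observer: records every reading it is notified about."""
    def __init__(self):
        self.readings = []

    def update(self, celsius):
        self.readings.append(celsius)


station = WeatherStation()
phone, kiosk = Display(), Display()
station.register(phone)
station.register(kiosk)
station.set_temperature(21)
station.unregister(kiosk)
station.set_temperature(23)
print(phone.readings)  # [21, 23]
print(kiosk.readings)  # [21]
```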


    Using Java's built-in Observer pattern (allows observer to pull data from observable)...


  • Decorator pattern - attach additional responsibilities to an object dynamically. Decorators provide a flexible alternative to subclassing for extending functionality.
    - think Java I/O
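
A minimal Python sketch of the Decorator pattern: each wrapper has the same interface as the object it wraps and adds its own part. The beverage names and the prices in cents are made up for illustration:

```python
class Espresso:
    """Plain component."""
    def description(self):
        return "Espresso"

    def cost_cents(self):
        return 199


class Milk:
    """Decorator: wraps any beverage and adds milk to it."""
    def __init__(self, beverage):
        self._beverage = beverage

    def description(self):
        return self._beverage.description() + ", Milk"

    def cost_cents(self):
        return self._beverage.cost_cents() + 10


class Mocha:
    """Decorator: wraps any beverage and adds mocha to it."""
    def __init__(self, beverage):
        self._beverage = beverage

    def description(self):
        return self._beverage.description() + ", Mocha"

    def cost_cents(self):
        return self._beverage.cost_cents() + 20


# Responsibilities are stacked at run time instead of through subclasses.
order = Mocha(Milk(Espresso()))
print(order.description())  # Espresso, Milk, Mocha
print(order.cost_cents())   # 229
```

Java I/O works along the same lines, e.g. a BufferedReader wrapping an InputStreamReader wrapping a FileInputStream.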




Ref:
Head First Design Patterns by Eric Freeman, Elisabeth Robson, Kathy Sierra, and Bert Bates

Monday, April 13, 2009

Web Application Basics

HTTP - Hypertext Transfer Protocol, is a stateless request/response protocol (carried over TCP connections) for communication between clients/browsers and web servers

TCP - Transmission Control Protocol, allows communication between application software. It is responsible for breaking data down into IP packets before they are sent, and for reassembling the packets when they arrive.
- connection-oriented
- used by applications to make connections and exchange information with one another
IP - Internet Protocol, is responsible for sending the packets to the correct destination
- connectionless

URL - Uniform Resource Locator, is a full specification of a resource. It includes the protocol, host machine name (domain name or IP address), optional port number and resource location.
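
The parts of a URL can be pulled apart with Python's standard urllib.parse module. The URL below is a made-up example:

```python
from urllib.parse import urlsplit

# Hypothetical URL with an explicit port number.
parts = urlsplit("https://www.example.com:8443/docs/index.html?lang=en")

print(parts.scheme)    # protocol
print(parts.hostname)  # host machine name
print(parts.port)      # optional port number
print(parts.path)      # resource location

# When the port is omitted, the default for the protocol applies (80 for HTTP).
no_port = urlsplit("http://www.example.com/")
print(no_port.port)
```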
Three ranges of port numbers:
  1. well-known ports - from 0 through 1023
    20 & 21: File Transfer Protocol (FTP)
    22: Secure Shell (SSH)
    23: Telnet remote login service
    25: Simple Mail Transfer Protocol (SMTP)
    53: Domain Name System (DNS) service
    80: Hypertext Transfer Protocol (HTTP) used in the World Wide Web
    110: Post Office Protocol (POP3)
    119: Network News Transfer Protocol (NNTP)
    143: Internet Message Access Protocol (IMAP)
    161: Simple Network Management Protocol (SNMP)
    194: Internet Relay Chat (IRC)
    443: HTTP Secure (HTTPS)
    465: SMTP Secure (SMTPS)
  2. the registered ports - from 1024 through 49151
    - they can be registered to specific protocols by software corporations/companies or users
    - assigned by or registered with the Internet Assigned Numbers Authority (IANA) (or by the Internet Corporation for Assigned Names and Numbers (ICANN) before March 21, 2001)
  3. dynamic or private ports - from 49152 through 65535
    - available for use by any application
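
The three ranges translate directly into a small lookup, sketched here in Python. The function name port_range is made up for illustration:

```python
def port_range(port):
    """Return which of the three port ranges a port number falls into."""
    if not 0 <= port <= 65535:
        raise ValueError("port must be between 0 and 65535")
    if port <= 1023:
        return "well-known"
    if port <= 49151:
        return "registered"
    return "dynamic/private"


for port in (80, 443, 8080, 49152):
    print(port, port_range(port))
```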
HTML - HyperText Markup Language, is the principal language between the client and server that expresses the content of webpages.

  • DOM - Document Object Model, is a programming interface that represents HTML/XML documents as a tree of objects that scripts can read and modify.

  • DHTML
    Dynamic HTML is a term used to describe the combination of HTML, style sheets, and scripts that allow dynamic pages.

  • JavaScript - is the most common scripting language for browsers.

  • Inversion of Control and Dependency Injection

    Martin Fowler, who coined the term Dependency Injection, considers it the same as Inversion of Control. He used the term Dependency Injection because he finds Inversion of Control too generic. However, other sources claim that Dependency Injection is just one form of Inversion of Control. With all the articles and blogs written about them, it is very easy to get confused. So I'll try to summarize them here according to my own understanding and hopefully, it will be simple and easy to understand.

    Inversion of Control is a general principle in which the flow of control is inverted, unlike the traditional sequential flow. It follows the "Hollywood principle" - "don't call us, we'll call you". Your objects don't call services directly. Instead, your objects expect to be called. It is the framework that manages the objects and services, and is aware of what to instantiate and invoke.

    The main idea of Inversion of Control is to have none of your classes know or care how they get the objects they depend on.

    Dependency Injection is a way of implementing Inversion of Control in which an external mechanism supplies the services (other objects) your objects use, so that your objects do not create or look them up directly and therefore do not depend on concrete implementations. The external mechanism is responsible for injecting the concrete implementations that your objects need.

  • Loose coupling - describes a relationship of two entities/objects where they can interact, but have very little knowledge of each other.
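
Dependency Injection and loose coupling can be sketched in a few lines of Python using constructor injection. All class names here are made up for illustration, and the "external mechanism" is just the last few lines of the script rather than a real framework:

```python
class SmtpMailer:
    """Concrete service (here it only formats the message instead of sending it)."""
    def send(self, to, message):
        return f"SMTP mail to {to}: {message}"


class FakeMailer:
    """Alternative implementation, e.g. for tests: records instead of sending."""
    def __init__(self):
        self.sent = []

    def send(self, to, message):
        self.sent.append((to, message))
        return "recorded"


class SignupService:
    """Depends only on 'something with a send method', not on a concrete class."""
    def __init__(self, mailer):
        self._mailer = mailer  # injected from outside, never created here

    def register(self, email):
        return self._mailer.send(email, "Welcome!")


# The external mechanism decides which implementation gets injected.
service = SignupService(SmtpMailer())
print(service.register("user@example.com"))

fake = FakeMailer()
test_service = SignupService(fake)
test_service.register("user@example.com")
print(fake.sent)
```

SignupService never names SmtpMailer, so the mailer can be replaced without touching the service, which is the loose coupling described above.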


To r: inline css vs external css
What is robustness?
Disadvantages of frames

Website-static
Webapp-dynamic

Get vs post- when to use get?

Session - uses cookies

Enabling technologies:
Compiled modules - Java servlets
Interpreted scripts - JSP