Abstract: Stock markets around the globe generate enormous volumes of data. In the era of the global village, stakeholders and participants in the stock market each develop their own repositories. As a result, the same data is deposited at different places and repeated again and again. This repetition consumes resources and wastes cost, energy, power, space, etc. The Indian market is relatively less mature than developed markets, and because of this a large amount of unplanned and unused data has been stored in the databases of intermediaries and other stakeholders. This paper continues our previous work on a green database model for the stock market. We have already discussed databases based on integer values [8] and referential values [9], but the green database model still needs some improvement. This paper discusses the database model for the stock market, taking the Indian stock market as a case study, and proposes a green database model. It also addresses the questions that arise from the integer value and referential value databases, and attempts to calculate the model’s green effect.
Keywords: green database, integer value database, referential value database, stock market, repositories.
I. INTRODUCTION
According to data available from SEBI (Securities and Exchange Board of India), 83,808 sub-brokers were registered in the Indian stock market at the end of financial year 2011. There are more than 30 stock exchanges in India. Billions of transactions take place every day in Indian stock markets, and each broker needs a customized set of data for forecasting. This in turn requires each broker to save data in its own customized way, with its own riders. Not only sub-brokers but also other players such as mutual funds, insurance companies and customers work with this stock market data in one way or another. Investment in the market is a continuous process, and the majority of interested parties develop their own databases and customize their reports to their needs. As a result, similar or identical data is stored in many databases. A large amount of space, time, power, money, etc. is required to provide this kind of facility to these interested parties. The ultimate goal of these players is to generate reports using statistical methodologies and to predict the market for future investment. This paper discusses a new methodology that reduces the amount of data in the database and saves resources without compromising the quality of forecasting and investment methodologies.
II. LITERATURE REVIEW
Little literature is available on database design and customization for stock market data. Researchers have focused primarily on the demand side of the market, and consequently most studies concentrate on forecasting. But whether the task is forecasting, reporting or anything else, all of it depends on the data and its behavior. The application of statistics and related tools is very common in the stock market. With the rapid development and globalization of financial markets (especially emerging financial markets), financial information processing has become a hot research area due to its immense practical applications. Such applications include stock market analysis, foreign exchange rate forecasting, option pricing, bank failure prediction, financial risk management, credit rating and scoring, bank loan management, customer relationship management, and anti-money laundering [12]. Some literature examines the relationship between the market and the Internet. It suggests that, in a booming economy, the price of Internet stocks has risen faster than that of standard stocks. Therefore, Internet firms will have lower excess returns on their stocks. Dependency on the Internet appears to be negatively related to the stock excess return and is the only significant variable in 1998 [1]. A study of the Taiwan stock exchange shows that, because many factors affect Taiwan stocks, closing prices are highly random. Therefore, a closing price forecast model should be as precise as possible [2]. The relationship between technology indicators and stock market performance was studied by Patrick Thomas in 2001 [11]. Because of its computational advantages, time-series analysis plays a significant role in the financial markets, where predictions (the decision-making processes) are based solely on the historical movements of stock prices (time series) [3]. Literature relating to the integer value of stock market databases is likewise scarce.
Technology is crucial in forecasting methods. In a booming and developing economy like India, managing energy, space, time and power demands priority. Choosing a particular technology to implement a business strategy may have a significant impact on a firm’s stock performance [1]. Semantic access to databases has a long history and originated at the early stages of database technology development. Unfortunately, these efforts have not yet led to widely accepted industrial technologies [2]. The market mechanism, combined with software agents, adjusts demand and supply through interaction among agents using prices as common information [6]. One data mining method integrates discretization, generalization and rough set feature selection. These methods reduce data horizontally and vertically. In the first phase, discretization and generalization are integrated, and numeric attributes are discretized into a few intervals [13]. In the early stages of a market, the volume of data is relatively low and economies are relatively closed. In closed economies, as India once was, markets depended on the policies of their own country, and other countries’ policies had little or no impact on them. Now economies are becoming global, one country’s policies affect other economies, and as a result the dynamics of data and information are changing very fast. In this dynamic scenario, the volume of data is increasing at a tremendous pace. Forecasting the market requires new and advanced tools and methodologies that can deal with these challenges. With ever-increasing data and information and new technologies being developed, companies must innovate in database technologies in order to keep up with the latest ideas [4]. Advances in computational intelligence have created opportunities that were never there before [5]. The efficiency of the stock market has been examined in terms of information efficiency and operational efficiency, with verifiable hypotheses set up as a quantitative approach [7].
The periodic distance measure operates in the frequency domain and can effectively identify sequences derived from the same generative process, allowing flexible matching even under conditions of time lag [10]. There is a great need for more flexible structural similarity measures between time-series sequences, based on the extraction of important periodic features. Non-parametric methods for accurate periodicity detection and new periodic distance measures for time-series sequences are being considered [11]. Green practices in the stock market are rare in countries like India. Hindrances such as a low education rate and poor administration have played a crucial role in holding back the implementation of greener technologies. Integer value data, integrated with other data sets, will save energy, time, cost, space, etc. while giving the same efficiency in forecasting the market. The studies reviewed above focus either on the relationship between the Internet and the stock market or on forecasting the market with the Internet. Although the Internet has played a crucial role in the market, the storage and processing of the same data again and again is the real challenge for researchers in the area of green computing. This paper focuses mainly on the databases of the market players and tries to reduce the cost, power, energy and space required to store this data.
III. GREEN DATABASE FOR STOCK MARKET: A CASE STUDY OF INDIAN STOCK MARKET
The green database proposed in this paper combines the integer value of the data with the address of the data. Stock prices are available as floating point values, and nearly every value differs from the others in some way. The graphs and reports generated by investors do not depend on these fractional values. Although a number of predictive methodologies and hypotheses are available for predicting the market, these predictions are never exact. They only suggest a future pattern of the market within a limited scope of deviation. This database is designed to save energy, cost, power, space, etc. There are two suggested green measures for the market database. The first is to save all data in full length at a central place such as the NSE (National Stock Exchange) or BSE (Bombay Stock Exchange), which others can access with the help of advanced information technologies like cloud computing. The second is to save all data in full length at a central place such as the NSE or BSE, while others also save data locally using the proposed database. This database is created and reorganized in two parts. The first part converts and reorganizes existing data into the new format, and the second part handles new data entering the database. The first part is carried out by Algorithm-01, in which the integer value [8] and referential value [9] methods are combined, and the second part by Algorithm-02. Both algorithms are given below:
Algorithm-01
1. Start
2. Initialize database
3. Initialize col(open), col(high), col(low) and col(close)
\ In step 4, all values in the matrix are converted into integers, except the date column.
4. newtable(val) = int(oldtable(val))
\ Steps 5 to 8 compare each closing value with the other values of the same column and delete duplicate rows.
5. For (start = first data of col(close); start != eof(col(close)); start++)
6. Compare start(val) with col(close)
7. If the same value is found
8. Then delete the entire row of the duplicate closing value and search for the next value in the closing column of the matrix
\ Save the new matrix as the extended matrix, denoted extmat.
\ Steps 9 to 14 compare all values in the extended matrix and replace each duplicate value with the address of the previous equal value.
9. For (extstart = first data of extmat; extstart != eof(extmat); extstart++)
10. Compare extstart(val) with all values(extmat)
11. If the same value is found
12. Then replace the duplicate value with the previous value’s address and search for the next value of the extended matrix
13. Save all duplicate values with the address of the previous value in the database of the extended matrix
14. Save new database
15. Exit
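The steps of Algorithm-01 can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `algorithm_01`, the in-memory list of rows, and the use of a `(row, column)` tuple to stand for an "address" are all assumptions made for illustration. A dictionary lookup replaces the pairwise comparison of steps 9 to 12, which should give the same result with fewer comparisons. The sample rows are the first four rows of Table 01.

```python
# First four rows of Table 01: (date, open, high, low, close).
SAMPLE_ROWS = [
    ("03-01-2000", 219.27, 228.99, 219.27, 228.99),
    ("04-01-2000", 231.16, 246.73, 225.02, 245.03),
    ("05-01-2000", 241.16, 246.25, 226.44, 234.65),
    ("06-01-2000", 235.88, 251.87, 235.88, 244.37),
]


def algorithm_01(rows):
    """Convert existing rows to the green format (illustrative sketch only)."""
    # Step 4: truncate every price to an integer; the date column is untouched.
    int_rows = [(d, int(o), int(h), int(l), int(c)) for d, o, h, l, c in rows]

    # Steps 5-8: keep only the first row for each integer closing value.
    seen_close = set()
    extmat = []
    for row in int_rows:
        if row[4] in seen_close:
            continue  # duplicate closing value: drop the entire row
        seen_close.add(row[4])
        extmat.append(row)

    # Steps 9-13: replace each repeated value with the address of its first
    # occurrence. An address is modelled here as a (row, column) tuple,
    # which is an assumption; the paper does not specify the address format.
    first_seen = {}
    green = []
    for i, (date, *prices) in enumerate(extmat):
        cells = []
        for j, value in enumerate(prices):
            if value in first_seen:
                cells.append(first_seen[value])
            else:
                first_seen[value] = (i, j)
                cells.append(value)
        green.append((date, cells))
    return green


for date, cells in algorithm_01(SAMPLE_ROWS):
    print(date, cells)
```

On these four rows the sketch stores an address for the low and close of 03-01-2000, the high of 05-01-2000 and the low of 06-01-2000, which agrees with the corresponding rows of Table 03.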
Algorithm-02
1. Start
2. Initialize database
3. Initialize col(open), col(high), col(low) and col(close)
\ Step 4 accepts the new value generated from the market into the database.
4. newval = enter new value
\ Steps 5 to 10 compare the new value with all existing values in the database; if the new value matches an existing one, the address of the previous value is saved, otherwise the new value is saved.
5. For (start = first data of database; start != eof; start++)
6. Compare start with newval
7. If the same value is found
8. Then save the address of the previous value in the database
9. Else save newval in the database
10. Save new database
11. Exit
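A minimal Python sketch of Algorithm-02 follows. The class name `GreenStore`, its `insert` method and the `(row, column)` address format are hypothetical, introduced only for illustration. The sketch also assumes that new prices are truncated to integers before comparison, as in step 4 of Algorithm-01; Algorithm-02 does not state this explicitly.

```python
class GreenStore:
    """Illustrative store for new entries; not the authors' implementation."""

    def __init__(self):
        self.rows = []        # list of (date, [cells])
        self.first_seen = {}  # integer price -> (row, column) of first occurrence

    def insert(self, date, open_, high, low, close):
        """Steps 4-10: store each new value, or the address of an equal earlier value."""
        row_index = len(self.rows)
        cells = []
        for col_index, price in enumerate((open_, high, low, close)):
            value = int(price)  # assumption: truncate as in Algorithm-01
            if value in self.first_seen:
                # Steps 7-8: a match exists, so save its address instead.
                cells.append(self.first_seen[value])
            else:
                # Step 9: no match, so save the new value itself.
                self.first_seen[value] = (row_index, col_index)
                cells.append(value)
        self.rows.append((date, cells))
        return cells


store = GreenStore()
store.insert("27-12-2012", 2370.85, 2397.5, 2362.7, 2388.35)
store.insert("28-12-2012", 2394, 2397.35, 2369.1, 2379.5)
store.insert("31-12-2012", 2370, 2396.7, 2368.65, 2385.5)
for date, cells in store.rows:
    print(date, cells)
```

Keeping the `first_seen` index alongside the rows means each new value costs one dictionary lookup rather than a scan of the whole database, at the price of holding the index in memory.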
Algorithm-01 is applied to the existing database and Algorithm-02 is used for new entries. This model was tested on the data of 10 companies available on the NSE and BSE: SBI (State Bank of India), TCS, Bajaj Auto, Bharti Airtel, GAIL, ICICI Bank, Jaypee, LIC, Tata Steel and RIL (Reliance Industries Limited). Section IV shows the results on SBI data. After the application of Algorithm-01, approximately 50 percent of the data was removed from the database as redundant, and after the application of Algorithm-02, the remaining database took approximately 20 percent less space. These results are tested and exemplified in the next section of the paper.
IV. COMPARATIVE STUDY OF GREEN DATABASE WITH TRADITIONAL DATABASE FOR STOCK MARKET
The traditional database used in the stock market stores every price of a stock as a floating point value, and it grows bulkier every day. The green database is designed to implement the green concept in the field of computer science. Table 01 shows the traditional database in the form of a matrix, holding the floating point prices of State Bank of India (SBI) from Jan/2000 to Dec/2012. In this matrix, the total number of values is 3242 (No. of Rows) * 05 (No. of Col.) = 16210. In Table 02, all values of the matrix in Table 01 except the date column are converted into integer values, and rows with a redundant closing price are searched for and removed. In this matrix, the total number of values remaining is 1602 (No. of Rows) * 05 (No. of Col.) = 8010. In Table 03, redundant values in Table 02 are searched for one by one, and every redundant value except the first occurrence is replaced by the address of the first occurrence. After this step, the number of values replaced by an address is approximately 20 percent of the values in Table 02.
Date Open High Low Close
03-01-2000 219.27 228.99 219.27 228.99
04-01-2000 231.16 246.73 225.02 245.03
05-01-2000 241.16 246.25 226.44 234.65
06-01-2000 235.88 251.87 235.88 244.37
– – – – –
– – – – –
24-12-2012 2340 2359 2325 2331.3
26-12-2012 2336.7 2380 2323 2370.85
27-12-2012 2370.85 2397.5 2362.7 2388.35
28-12-2012 2394 2397.35 2369.1 2379.5
31-12-2012 2370 2396.7 2368.65 2385.5
Table 01: Prices of SBI from Jan/2000 to Dec/2012 before integer value conversion at NSE, India [14].
Date Open High Low Close
03-01-2000 219 228 219 228
04-01-2000 231 246 225 245
05-01-2000 241 246 226 234
06-01-2000 235 251 235 244
– – – – –
– – – – –
24-12-2012 2340 2359 2325 2331
26-12-2012 2336 2380 2323 2370
27-12-2012 2370 2397 2362 2388
28-12-2012 2394 2397 2369 2379
31-12-2012 2370 2396 2368 2385
Table 02: Prices of SBI from Jan/2000 to Dec/2012 after integer value conversion and removal of rows with duplicate closing values from Table 01 [14].
Date        Open             High             Low             Close
03-01-2000  219              228              Address of 219  Address of 228
04-01-2000  231              246              225             245
05-01-2000  241              Address of 246   226             234
06-01-2000  235              251              Address of 235  244
–           –                –                –               –
–           –                –                –               –
24-12-2012  2340             2359             2325            2331
26-12-2012  2336             2380             2323            2370
27-12-2012  Address of 2370  2397             2362            2388
28-12-2012  2394             Address of 2397  2369            2379
31-12-2012  Address of 2370  2396             2368            2385
Table 03: Prices of SBI from Jan/2000 to Dec/2012 after substitution of addresses for all redundant values in Table 02 [14].
V. MARKET FORECASTING WITH THE HELP OF GREEN DATABASE
Market players require charts to predict the market. We drew different charts with the data of ten companies listed on the National Stock Exchange and found that the variation between the charts is minuscule (from 8 to 10 percent), and in most cases they are the same. Two sets of charts, drawn with the data of SBI and Reliance Industries Limited, are shown as examples and were found to be similar. In Fig. 01 (01), the chart is drawn with State Bank of India (SBI) data from Table 01, and in Fig. 01 (02), the chart is drawn with SBI data from Table 03.
The same chart has been drawn with the data of Reliance Industries Limited and is shown in Fig. 03 (01) and Fig. 03 (02). In Fig. 02 (01) and Fig. 02 (02), candlestick charts are drawn with the data from Table 01 and Table 03. It is not difficult to compare them and find that corresponding charts are nearly identical.
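One ingredient of this comparison, the error introduced by integer conversion alone, can be estimated with a short Python sketch. It uses only the closing prices visible in the printed rows of Table 01; the function name `relative_truncation_error` is hypothetical. The sketch does not account for rows deleted as duplicates and does not reproduce the chart comparison itself, so it should be read as a rough bound on one source of deviation rather than as the paper's measurement.

```python
# Closing prices copied from the rows of Table 01 that are shown in the paper.
SAMPLE_CLOSES = [228.99, 245.03, 234.65, 244.37,
                 2331.3, 2370.85, 2388.35, 2379.5, 2385.5]


def relative_truncation_error(price: float) -> float:
    """Fraction of the price lost when it is truncated to an integer."""
    return (price - int(price)) / price


errors = [relative_truncation_error(p) for p in SAMPLE_CLOSES]
worst = max(errors)
print(f"largest relative truncation error: {worst:.4%}")
```

Because truncation removes less than one unit from each price, the relative error of a single value is always below 1/price, so it shrinks as the price level rises.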
Fig. 01 (01): Chart of SBI using the traditional database with data from Jan/2000 to Dec/2012 (from Table 01).
Fig. 01 (02): Chart of SBI using the green database with data from Jan/2000 to Dec/2012 (from Table 03).
Fig. 02 (01): Candlestick chart of SBI using the traditional database with data from Jan/2000 to Dec/2012 (from Table 01).
Fig. 02 (02): Candlestick chart of SBI using the green database with data from Jan/2000 to Dec/2012 (from Table 03).
Fig. 03 (01): Chart of Reliance Industries Limited using the traditional database with data from Jan/2000 to Dec/2012 (from Table 01).
Fig. 03 (02): Chart of Reliance Industries Limited using the green database with data from Jan/2000 to Dec/2012 (from Table 03).
VI. GREEN EVALUATION OF GREEN DATABASE
In the current scenario, all participants in the stock markets use traditional databases. Every player maintains its own database, and a large amount of space, time, cost and power is used worldwide. The same data is saved again and again. In the era of the green computing revolution, computer scientists are moving towards green technologies, and the present methodology runs against that trend. The proposed database design will contribute substantially towards green technology. The matrix in Table 01 contains 3242 (No. of Rows) * 05 (No. of Col.) = 16210 values. Table 02 contains 1602 (No. of Rows) * 05 (No. of Col.) = 8010 values. Table 03 contains 6428 values and 1582 addresses. Let x be the amount of space required to save one address in place of a value, and y the amount of space required to save one value in the database. Then 8200 values are deleted outright and 1582 values are each replaced by an address, so the total space saved is 8200 * y + 1582 * (y - x). Billions of values are generated every day in the stock market, so it is not difficult to imagine the total amount of space saved by this method. In other words, about 55 percent of space is saved by applying this method. Similarly, power and other scarce resources are saved in equal or even greater proportion, because the same data is processed again and again. By applying Algorithm-01, around 50 percent of the data is deleted as redundant, and by applying Algorithm-02, a further 20 percent of the data is replaced by the address of previous data, which saves around 10 percent of the remaining space. So the total space saved by applying both algorithms is about 55 percent. A natural question is what happens to the deleted data. The answer has already been given in the previous sections: this methodology applies to followers of the second method, in which all data is saved in full length at a central place such as the NSE (National Stock Exchange) or BSE (Bombay Stock Exchange) and others also save data using the proposed database. If any player or interested party needs the complete data, it can retrieve it from a central agency such as the NSE or BSE.
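The space calculation can be checked with a short Python sketch. It reads x as the size of one address and y as the size of one stored value, so that each substituted value saves y - x. The byte sizes used here (8 bytes per value, 4 bytes per address) are assumptions chosen so that an address takes half the space of a value, in line with the statement that substituting 20 percent of the values saves about 10 percent of the remaining space. The function name `space_saved` is hypothetical.

```python
def space_saved(total_values, kept_values, addresses, value_size, address_size):
    """Return (absolute saving, fraction saved) relative to the traditional table."""
    original = total_values * value_size
    # In the green table, `addresses` of the kept cells hold an address
    # and the remaining kept cells hold a full value.
    green = (kept_values - addresses) * value_size + addresses * address_size
    saved = original - green
    return saved, saved / original


# Counts from Tables 01 to 03; the sizes (8 and 4 bytes) are assumptions.
saved, fraction = space_saved(
    total_values=16210, kept_values=8010, addresses=1582,
    value_size=8, address_size=4,
)
print(f"saved {saved} bytes, {fraction:.1%} of the original")
```

Under these assumed sizes the saving is a little over 55 percent, consistent with the figure reported in the text; a larger address or a smaller value type would lower it.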
VII. CONCLUSION AND FUTURE WORK
The proposed database model will be helpful for all players in the stock market as well as other interested parties. On the one hand, the price values of stocks are conserved in their actual format and in their entirety at the central agencies; on the other hand, other parties can also save data of their own if needed. Databases created by parties other than the central agencies can save cost, space, time and power by implementing these methodologies, without compromising the quality of forecasting tools. Analysts generally focus on charts and are concerned only with their output. Using the model proposed above, we can say that:
1. The green database can save 55 percent of storage space, as well as time, cost, power, etc.
2. The green database can easily be integrated with new technologies like cloud computing.
3. This database will be equally efficient and accurate in forecasting.
There is tremendous scope for further work on developing the architecture and enhancing the green database. Integrating this database with other green technologies such as cloud computing, mobile computing, HPC, etc. is a new-era challenge and needs to be addressed properly for stock market data.
REFERENCES
[1] Aurore J. Kamssu, Brian J. Reithel, Jennifer L. Ziegelmayer, ‘Information Technology and Financial Performance: The Impact of being an Internet-Dependent Firm on Stock Returns’, Information Systems Frontiers 5:3, pp. 279-288, 2003.
[2] Chien-Jen Huang, Peng-Wen Chen, Wen-Tsao Pan, ‘Using multi-stage data mining technique to build forecast model for Taiwan stocks’, Springer-Verlag London Limited, DOI 10.1007/s00521-011-0628-0, 2011.
[3] Di Wu, Gabriel Pui Cheong Fung, Jeffrey Xu Yu, Qi Pan, ‘Stock Prediction: An event driven approach based on bursty keywords’, Front. Comput. Sci. China, Vol. 3(2), pp. 145-157, 2009.
[4] Eamonn Keogh, Jessica Lin, ‘Clustering of time-series subsequences is meaningless: implications for previous and future research’, Knowledge and Information Systems, Vol. 8, pp. 154-177, 2005.
[5] Edward Tsang, ‘Forecasting - where computational intelligence meets the stock market’, Front. Comput. Sci. China, Vol. 3(1), pp. 53-63, 2009.
[6] Hsing-Wen Wang, ‘Intelligent forecasting models-selection system for the portfolio internal structure change’, Springer-Verlag, DOI 10.1007/s00500-007-0177-8, 2007.
[7] Ji-Yong Seo, Sangmi Chai, ‘The role of algorithmic trading systems on stock market efficiency’, Inf Syst Front, Vol. 15, pp. 873-888, 2013.
[8] Krishna Kumar Singh, Dr. Priti Dimri, Dr. J. N. Singh, ‘Green Database Management System for the intermediaries of Indian Stock Market’, CSIBIG2014, pp. 31-35, 2014.
[9] Krishna Kumar Singh, Dr. Priti Dimri, Soumitra Chakraborty, ‘Green Referential Database for the Indian Stock Market’, International Journal of Computer Applications, Vol. 89(3), pp. 8-11, 2014.
[10] Michail Vlachos, Philip S. Yu, Vittorio Castelli, ‘Structural Periodic Measures for Time-Series Data’, Data Mining and Knowledge Discovery, Vol. 12, pp. 1-28, 2006.
[11] Patrick Thomas, ‘A relationship between technology indicators and stock market performance’, Scientometrics, jointly published by Kluwer Academic Publishers, Dordrecht, and Akadémiai Kiadó, Budapest, Vol. 51, No. 1, pp. 319-333, 2001.
[12] Shuo Bai, Shouyang Wang, Lean Yu, Aoying Zhou, ‘Financial information processing and development of emerging financial markets’, Front. Comput. Sci. China, Vol. 4(2), pp. 185-186, 2010.
[13] Xiaohua Hu and Nick Cercone, ‘Data Mining via Discretization, Generalization and Rough Set Feature Selection’, Knowledge and Information Systems, Springer-Verlag, pp. 33-60, 1999.
[14] www.nseindia.com