Abstract.
A carefully designed and configured firewall is a good starting point for securing a computer network against malicious users. However, complex network environments, with a large number of participants and endpoints communicating over multiple undefined dynamic channels, require a stronger security infrastructure. Intrusion Detection Systems (IDS) have been proposed as a solution for dealing with multiple threats and direct and indirect attacks, along with run-time problems such as buffer overflow, string vulnerabilities, and starvation, in less time. The major problem is speed, and our work focuses on quick search and response using heuristic A* search. The A* search technique checks the admissibility of the heuristic through an evaluation function in a very short interval. Its role is to find the target value in the discrete tree structure created by the security scanner and to place that value in the threshold table for final comparison and remedial action. The hybrid prevention system shows considerable performance complexity because of its multilevel structure for searching and segmentation. This paper illustrates a possible solution through the example of the distance and optimal path between two cities among the available paths.
Keywords: Intrusion System, IDS, Firewall, k-mean mining technique, A* Search, Heuristic search.
1. Introduction
An ID system is used to find n categories of malicious inputs arriving over different input channels, using different security mechanisms, and it provides safeguards over the network. The major task of any IDS is to analyze the traffic entering the network and to differentiate between original packets and malicious data [6][9][10]. Here the system classifies attack identification methods as follows: flow based, abnormal time based, and behavior pattern based. The ID system collects and analyzes the required information from various components deployed for different purposes over the network in order to identify possible threats that make the network insecure [11][13][14]. For different logical network configurations, several IDS exist and are reliable in detecting various suspicious actions from different sources. Whenever an IDS finds a suspicious packet, it instantly creates an alert. Genetic Algorithms (GA) have been used as a technical method for intrusion detection with reliable response: the network connection information is encoded and transformed into rules in the IDS, and finally the result of applying the GA is presented. Baker [1][2] discussed synthesizing intrusion detection using custom computing machines together with pattern matching and time-based intrusion, but how this is solved optimally is left open. We can also find analyses of intrusion detection with Network Processor (NP)-based network devices, whose number is gradually increasing. In [30][31], the Ubicom Network Processor is discussed and an embedded Network Intrusion Detection System (NIDS) is presented. In [22][23], Li Yong and Gao Guo showed how a new intrusion detection method can be linked with an improved DBSCAN methodology and novel rule-based data mining.
Day by day, IDS is receiving serious attention in network security, with work reviewing its drawbacks and forecasting upcoming problems. The source of a problem may be a signature-based anomaly, a host-based intrusion, or a network-based intrusion; whatever the reason, the real job is to perform intrusion detection and prevention well. The triggering of alarms in the dynamic world can be mapped to the performance of IDS in the digital era. An IDS should always be kept up to date because of the ongoing contest between attackers and security defenses: attackers keep inventing new ways to attack, and security personnel in turn add enhanced dimensions to secure networks from malicious inputs. The heavy use of the Internet and online trading has made organizations more susceptible to virtual threats than ever before, which leads to violations of data integrity, loss of customer confidence, degraded job productivity, and ultimately financial crisis for the company. According to the 2004 CSI/FBI Computer Crime and Security Survey, the organizations that acknowledged financial loss due to attacks (269 of them) reported $141 million lost, and this number has only grown since. [16][17] The parallel hybrid intrusion prevention system, presented at the international conference FGIT-2015 in Korea, showed optimal results by using a multilevel hybrid approach and implementing the k-mean threshold approach. Its components are defined as follows: (1) anomaly IDS, (2) signature-based IDS, (3) network-based (NIDS), and (4) host-based (HIDS) intrusion detection systems. Apart from these IDS, we use a few more software devices that help to secure the system more optimally: (5) a flow-based detector and (6) time-slot-based detectors [19][20][21]. Several systems have been designed and constructed, but they all suffer from the same issue: traversal and the creation of clusters for diverse data.
From the facts collected, we can say that such a system may suffer from high time complexity, which our proposed system addresses with the help of the heuristic straight-line distance and A* search.
2. Related Work
Research on tools and empirical methodologies for network processing mainly focuses on modularity, reusability, and ease of logical programming. The research in this section shows how intrusion detection can be performed on a network processor. Network intrusion detection avoidance is described in several papers; in [30, 31] the authors proposed the concept of avoidance and concluded that avoidance will be successful if the implementation of the NIDS differs from the endpoint implementation. Most existing IDS are optimized to detect attacks with high accuracy. However, they still have various disadvantages that have been outlined in a number of publications, and typical work has analyzed IDS in order to direct future research [27, 29, 32]. Among others, one drawback is the large number of alerts produced, some of which are redundant and unnecessary. An intrusion prevention system is a technique used to protect the system from illegal activity arriving from different data sources. Many of the algorithms designed so far are either very complex to implement in hardware or slow [12, 15, 16]. Much work has already been published on different types of attacks over the network, and likewise on clustering and data mining techniques. Baker [2, 16] discussed intrusion tolerance in distributed computing, and B. Ling described the application of K-means clustering to intrusion detection and how k-means can partition a large database into finite data sets, measuring the centroid value in order to stop unknown new attacks. Yu Guan [18] worked on a different mean technique, introducing the Y-means clustering algorithm for finding and analyzing intrusion activity.
Much work has been done on the performance of IP flow-based and packet-based intrusion detection systems in complex, high-speed networks [14][15]. Chitrakar and Huang [20] proposed a hybrid learning approach that integrates the k-medoids technique with a Bayes classification rule for data partitioning and distribution in cluster formation and processing. Huang also focused on SVM classification for anomaly detection, representing real-world scenarios of data distribution. Apart from the k-mean and SVM techniques, Gao Guo-Hong [21] proposed an enhanced intrusion detection model based on DBSCAN, which forms clusters based on density under the constraints of core points, border points, and noise points.
Flow Based Detector:
The flow-based technique is used to control and monitor network traffic and to provide reliable, secure communication over the network.
Time Slot Based Detector: This detector deals with race conditions, which result in resource conflicts. The situation arises when one process tries to beat another to certain events; the detector is used for static detection and for identifying deadlock in the system.
Behavior Based Detector: The behavior detector works on four principles for different purposes. Based on the nature of the inputs, it decides what type of checking is required and conducts the test for one of four cases: (1) anomaly detection, (2) signature-based detection, (3) host-based detection, (4) network-based detection.
3. Review of related Factors:
1. Decision Tree Learning: The basic idea behind DTL is to test the most important attribute first, where the most important attribute is the one that makes the most difference to the classification. The classification must be correct and be generated from a small number of test samples, so that the tree can be represented as a very small data structure.
2. Searching Methodology: Searching can be defined as the test or operation that finds the location of the desired value in the memory tree. A search may be successful or unsuccessful depending on its test set. Searching is difficult because of the complex data structure in which the data is first stored, and it sometimes leads to very poor worst-case time complexity. So before adopting any technique, we must analyze its optimality for the given problem. The basic operations of any search method are searching for a node, expanding the current state, and generating new states if the value is not found among the available states.
3. Searching Strategy: The expansion and generation of new states is defined by the search strategy with the help of the following components:
a) State: A state corresponds to a configuration of the given data structure or network system.
b) Parent/root node: Generally, it can be defined as the initial node, which generates new nodes as required. Sometimes a tree pruning technique is used to reduce the complexity of the tree.
c) Action: This defines the particular instruction applied to the parent node to generate the next level of knowledge by creating a new node.
d) Path Cost: This is the total cost incurred to reach the destiny node from the parent node, and it is denoted g(n).
e) Depth: This is the total number of steps traversed from the initial point to the desired value.
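The components a) to e) can be sketched as a small node record. This is only an illustrative sketch: the names Node, child_node, and path_to, as well as the action labels and step costs, are hypothetical and are not taken from the system described in this paper.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class Node:
    """One search-tree node holding the components a) to e)."""
    state: Any                        # a) configuration this node stands for
    parent: Optional["Node"] = None   # b) node that generated this one (None for the root)
    action: Optional[str] = None      # c) instruction applied to the parent
    path_cost: float = 0.0            # d) g(n): total cost from the root to this node
    depth: int = 0                    # e) number of steps from the root


def child_node(parent: Node, action: str, state: Any, step_cost: float) -> Node:
    """Generate a new node by applying an action to a parent node."""
    return Node(state=state, parent=parent, action=action,
                path_cost=parent.path_cost + step_cost,
                depth=parent.depth + 1)


def path_to(node: Optional[Node]) -> list:
    """Follow parent links back to the root and return the states in order."""
    states = []
    while node is not None:
        states.append(node.state)
        node = node.parent
    return list(reversed(states))


# Hypothetical usage: two expansions starting from a root state "A".
root = Node(state="A")
g_node = child_node(root, "go-G", "G", 10)
f_node = child_node(g_node, "go-F", "F", 7)
```

Keeping the parent link in each node is what lets the final path be recovered once the desired value is found, without storing a full path in every node.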
1. BFS: Best-first search (BFS) is used for searching and traversing discrete graph and tree structures. The principle behind BFS is to expand the node with the least evaluated distance to the goal, as measured by an evaluation function.
Adjacency node lists:
A: B, G, H
B: C
C: H
D: C, E
E: DESTINY
F: I
G: F, H
H: B
I: C, H
The diagram shows the optimal path among all available paths from A to E:
A - G - F - I - D - E
Total cost = 10 + 7 + 5 + 7 + 20
= 49
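The total above can be checked with a few lines of code. The sketch reads the path as A, G, F, I, D, E and assumes that the five listed costs belong to its five consecutive edges in the order given; that mapping is our assumption, since the diagram itself is not reproduced here. The names EDGE_COST and path_cost are hypothetical.

```python
# Assumed edge costs: the five terms of the sum, mapped in order onto
# the consecutive edges of the path A, G, F, I, D, E.
EDGE_COST = {
    ("A", "G"): 10,
    ("G", "F"): 7,
    ("F", "I"): 5,
    ("I", "D"): 7,
    ("D", "E"): 20,
}


def path_cost(path, edge_cost):
    """Sum the cost of every consecutive edge along the path."""
    return sum(edge_cost[(a, b)] for a, b in zip(path, path[1:]))


total = path_cost(["A", "G", "F", "I", "D", "E"], EDGE_COST)
```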
BFS is good in practice, but in some cases it gives inaccurate results. When the evaluation function is exactly accurate, the search returns the best desired node; when it is not, the result is inaccurate.
2. G-BFS: Greedy best-first search expands the node that is closest to the desired node, on the grounds that this is likely to lead to a solution quickly.
Evaluation function: f(n) = h(n)
3. GBFS is used to find malicious inputs (intruders) in a short session, minimizing time complexity when a huge number of input lines interact. It is similar to DFS in that it prefers to follow a single path all the way to the goal node but backs up when it hits a dead end. GBFS has the same problems as DFS: it is not optimal, and it is incomplete, because it can start down an infinite path and never return to try other possibilities.
4. Beyond the BFS and GBFS algorithms, a key component of the evaluation function is the heuristic function, denoted h(n).
5. Heuristic Search: Heuristic functions are the most common form in which additional knowledge of the problem is embedded to make the search more effective. This additional information works like a flag for spotting malicious input at an early stage. h(n) is defined as the estimated cost of the cheapest path from node n to the goal node. In the data set, it can be represented by measuring the deviation of the distance threshold from the k-mean simple threshold value. Here we measure the max-min threshold value to determine the impact of negative and positive deviation.
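The greedy best-first behavior discussed above, with f(n) = h(n), can be illustrated with a minimal sketch. The graph, the heuristic values in H, and the edge costs in COST are hypothetical and chosen only to show that following h(n) alone can return a path that is not the cheapest one; they do not come from the data set used in this paper.

```python
import heapq


def greedy_best_first(graph, h, start, goal):
    """Always expand the frontier node with the smallest h(n); ignore g(n)."""
    frontier = [(h[start], start, [start])]   # entries are (h, node, path so far)
    visited = set()
    while frontier:
        _, node, path = heapq.heappop(frontier)
        if node == goal:
            return path
        if node in visited:
            continue
        visited.add(node)
        for nxt in graph.get(node, []):
            if nxt not in visited:
                heapq.heappush(frontier, (h[nxt], nxt, path + [nxt]))
    return None  # goal not reachable from start


# Hypothetical graph: two routes from S to G.
GRAPH = {"S": ["A", "B"], "A": ["G"], "B": ["G"], "G": []}
H = {"S": 4, "A": 1, "B": 2, "G": 0}   # assumed heuristic estimates
COST = {("S", "A"): 1, ("A", "G"): 10, ("S", "B"): 2, ("B", "G"): 2}

path = greedy_best_first(GRAPH, H, "S", "G")
greedy_cost = sum(COST[(a, b)] for a, b in zip(path, path[1:]))
cheaper_cost = COST[("S", "B")] + COST[("B", "G")]
```

In this toy case the search is drawn to A because h(A) is smallest, and it reaches the goal through the expensive edge from A to G, while the route through B is cheaper. This is the sense in which GBFS is fast but not optimal.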
4. Proposed System
The proposed system focuses on speed: it uses an informed search strategy to reduce time complexity. An informed search strategy can be defined as one that “uses problem-specific knowledge and can give the optimal solution compared to others.”
When heterogeneous input passes through the security scanner of the hybrid intrusion system, the analyzer becomes active at that moment and starts categorizing the different inputs into the FBD, TSBD, or BBD type.
This is a very challenging job, because filtering and searching a continuous input stream is difficult. Our approach therefore aims to make the security scanner respond quickly and efficiently. The working methodology is to collect the entire data efficiently using the heuristic and A* search.
Together, these two techniques search the collected data rapidly and create the tree structure with labeled data indexing. Once the data is stored in the tree, it is passed through the heuristic search, and only if the heuristic is admissible does it proceed to A* search.
This search finds the most optimal value in the least time and sends it to the table, where the k-mean threshold value is compared with the std_deviation (min-max) value, followed by the triggering of events.
Working Module:
Input Streams: Undefined combinations of input streams that come from several sources and may create threats to the personal system, database, or network.
Security Scanner: This device is an integrated combination of hardware and software. It is deployed before the firewall on the network to categorize inputs and assign them to the various intrusion detection techniques in order to identify the nature of the inputs.
Creating Data Structure Indexing: This is used to store the data temporarily in memory in a well-defined discrete data structure and to assign a labeling flag value as a regional index for easy access.
Heuristic admissibility by A* search: A* search is basically used to minimize the total estimated solution cost. It can be regarded as an advanced form of BFS, and A* is optimal if the heuristic straight-line distance (HSLD) is admissible. The HSLD cannot be computed directly; it requires correlated prior knowledge before the desired inputs can be predicted.
It evaluates nodes by combining g(n) and h(n):
g(n): cost to reach the node
h(n): estimated cost to get from the node to the goal
such that f(n) = g(n) + h(n) // estimated cost of the cheapest solution through n.
So, if a user wants to find the cheapest solution, A* search is a complete and optimal technique.
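The evaluation f(n) = g(n) + h(n) can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used in the proposed system: the graph, step costs, and heuristic values are hypothetical, and the function name a_star is ours.

```python
import heapq


def a_star(graph, h, start, goal):
    """Expand nodes in order of f(n) = g(n) + h(n).

    graph maps a node to a list of (neighbor, step_cost) pairs.
    Returns (path, cost), or (None, inf) if the goal is unreachable.
    """
    frontier = [(h[start], 0, start, [start])]   # entries are (f, g, node, path)
    best_g = {start: 0}                          # cheapest g(n) found so far
    while frontier:
        _, g, node, path = heapq.heappop(frontier)
        if node == goal:
            return path, g
        if g > best_g.get(node, float("inf")):
            continue                             # stale entry, a cheaper route exists
        for nxt, step in graph.get(node, []):
            new_g = g + step
            if new_g < best_g.get(nxt, float("inf")):
                best_g[nxt] = new_g
                heapq.heappush(frontier, (new_g + h[nxt], new_g, nxt, path + [nxt]))
    return None, float("inf")


# Hypothetical graph with two routes from S to G and assumed heuristic values.
GRAPH = {"S": [("A", 1), ("B", 2)], "A": [("G", 10)], "B": [("G", 2)], "G": []}
H = {"S": 4, "A": 1, "B": 2, "G": 0}   # never above the true remaining cost here

path, cost = a_star(GRAPH, H, "S", "G")
```

Because g(n) is included in the ordering, the expensive route through A is pushed back in the queue once its accumulated cost is known, and the cheaper route through B is returned.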
Figure (2).
The figure shows that, among all the available paths, a direct search is not possible. We need a correlated value through which we can predict the next optimal step. This is provided by the heuristic: if the heuristic can verify that the straight-line distance is feasible, it is assumed to be admissible. Hence A* is optimal if the heuristic is admissible.
Table (1).
S.no | Type of Detection | Categories of inputs | k-mean threshold value | Standard deviation (min-max) value | Privilege to enter into n/w (Y/N)
1 | FBD | Buffer overflow | 1.6279 | ≤ 2.000 | Y
  |  | Unexpected flow | 0.9876 |  | Y
  |  | Abnormal input | 0.10176 |  | Y
  |  | Total = mean value | 2.717/3 = 0.905 |  | Y
2 | TSBD | Race condition | 2.0179 | ≤ 2.000 | N
  |  | Total = mean value | 2.0179 |  | N
3 | BBD | Anomaly | 0.0198 | ≤ 2.000 | Y
  |  | Signal | 1.731 |  | Y
  |  | Host | 2.091 |  | Y
  |  | N/w based | 1.311 |  | Y
  |  | Total = mean value | 5.1528/4 = 1.288 |  | Y
Table (1) shows the hybrid structure of the intrusion prevention system.
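The comparison recorded in the table can be sketched in a few lines. The sketch recomputes each detector's mean from the listed k-mean values and grants the privilege when that mean does not exceed 2.000. Taking the decision on the group mean rather than on each individual row is our reading of the table, and the names TABLE, THRESHOLD, and privilege are hypothetical.

```python
THRESHOLD = 2.000   # standard deviation (min-max) bound listed in the table

# k-mean threshold values per detector type, as listed in the table.
TABLE = {
    "FBD": [1.6279, 0.9876, 0.10176],
    "TSBD": [2.0179],
    "BBD": [0.0198, 1.731, 2.091, 1.311],
}


def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


def privilege(values, threshold=THRESHOLD):
    """Assumed rule: allow entry (Y) when the group mean is within the bound."""
    return "Y" if mean(values) <= threshold else "N"


decisions = {name: (round(mean(vals), 3), privilege(vals))
             for name, vals in TABLE.items()}
```

Under this reading, a single row may lie above the bound while its group is still admitted, as long as the group mean stays within it.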
Our work targets the time complexity of searching for data in the data tree. Here we use the heuristic and A* search to find the desired value.
The optimality of A* is straightforward to analyze when it is implemented with tree search: A* is optimal if h(n) is an admissible heuristic, that is, if h(n) never overestimates the cost to reach the goal. Admissible heuristics are by nature optimistic, because they assume the cost of solving the problem is less than it actually is.
Since g(n) is the exact cost to reach n, an immediate consequence is that f(n) never overestimates the true cost of a solution through n.
Proof:
We can prove that A* using tree search is optimal if h(n), the straight-line distance, is admissible.
Step 1: Suppose a suboptimal destiny node ND appears on the fringe, and let the cost of the optimal solution be cst*.
Step 2: ND is suboptimal, and because it is a destiny node, h(ND) = 0.
Step 3: It follows that
f(ND) = g(ND) + h(ND) = g(ND) > cst*.
Step 4: Now consider a fringe node n on an optimal solution path. If h(n) does not overestimate the cost of completing the solution path, then f(n) = g(n) + h(n) ≤ cst*.
Now we have shown that
f(n) ≤ cst* < f(ND).
Step 5: So ND will not be expanded, and A* must return an optimal value.
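The admissibility condition used in the proof, namely that h(n) never exceeds the true remaining cost, can be checked on a small graph. The sketch below computes the true cost from every node to the goal with Dijkstra's algorithm on the reversed edges and compares it with h(n). The graph and both heuristic tables are hypothetical, and the function names are ours.

```python
import heapq


def true_costs_to_goal(graph, goal):
    """Cheapest cost from each node to the goal (Dijkstra on reversed edges)."""
    reverse = {}
    for u, edges in graph.items():
        for v, w in edges:
            reverse.setdefault(v, []).append((u, w))
    dist = {goal: 0}
    heap = [(0, goal)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue                      # stale queue entry
        for prev, w in reverse.get(node, []):
            nd = d + w
            if nd < dist.get(prev, float("inf")):
                dist[prev] = nd
                heapq.heappush(heap, (nd, prev))
    return dist


def is_admissible(graph, h, goal):
    """True if h(n) <= true remaining cost for every node that can reach the goal."""
    h_star = true_costs_to_goal(graph, goal)
    return all(h[n] <= h_star[n] for n in h_star)


# Hypothetical graph and heuristics.
GRAPH = {"S": [("A", 1), ("B", 2)], "A": [("G", 10)], "B": [("G", 2)], "G": []}
H_GOOD = {"S": 4, "A": 1, "B": 2, "G": 0}   # never overestimates
H_BAD = {"S": 4, "A": 1, "B": 3, "G": 0}    # overestimates at B (true cost is 2)

good = is_admissible(GRAPH, H_GOOD, "G")
bad = is_admissible(GRAPH, H_BAD, "G")
```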
Further, this value is kept in specified test sets, which are passed through the k-Mean simple threshold algorithm. This technique works by conducting k-partitioning of the given data sets. First, the centroid values are selected if given; otherwise the user can declare the initial centroids. Each point is then assigned to its closest centroid, and the collection of points assigned to a centroid is a cluster. The normal value is given by the value metric table. This table is used to compare the actual k-mean threshold value with the predefined standard deviation value and, based on the result, to decide which action or privilege is assigned to each individual input stream.
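The partitioning and comparison steps above can be sketched for one-dimensional values. This is a minimal sketch under our own assumptions: the sample points, the initial centroids, the bound of 2.0, and the rule of comparing each centroid with that bound are hypothetical stand-ins for the value metric table.

```python
def assign(points, centroids):
    """Put each point into the cluster of its closest centroid."""
    clusters = [[] for _ in centroids]
    for p in points:
        idx = min(range(len(centroids)), key=lambda i: abs(p - centroids[i]))
        clusters[idx].append(p)
    return clusters


def k_means_1d(points, centroids, max_iter=100):
    """Alternate assignment and centroid update until the centroids stop moving."""
    for _ in range(max_iter):
        clusters = assign(points, centroids)
        updated = [sum(c) / len(c) if c else centroids[i]   # keep an empty cluster's centroid
                   for i, c in enumerate(clusters)]
        if updated == centroids:
            break
        centroids = updated
    return centroids, assign(points, centroids)


# Hypothetical input values and user-declared initial centroids.
POINTS = [0.1, 0.2, 0.3, 2.2, 2.4, 2.6]
THRESHOLD = 2.0   # assumed predefined bound

centroids, clusters = k_means_1d(POINTS, [0.0, 1.0])
privileges = ["Y" if c <= THRESHOLD else "N" for c in centroids]
```

With these sample values the points separate into a low cluster and a high cluster, and only the low cluster's centroid falls within the assumed bound.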
5. Conclusions
After surveying and discussing different intrusion detection techniques and their associated search techniques, which exhibit different time and space complexities, the data indicate that ordinary search techniques cannot be applied to varied dynamic inputs for searching and categorizing the desired data. If we succeed in constructing a hybrid system with a dynamic search technique for intrusion detection that reduces time complexity, it will be a proven module in the field of data security and intrusion detection. Finally, if we succeed in constructing a dynamic search for a hybrid system that can work in any environment, then both time and cost will be minimized with optimality.
6. Future work
Performance optimization using the HSLD-A* search technique in the Hybrid Intrusion Prevention System will be used in future network security work for optimal searching and response, along with the analysis of multiple threats in virtual environments.
7. References
1) Z. K. Baker and V. K. Prasanna. A methodology for synthesis of efficient intrusion detection systems on FPGAs. In Proceedings of the Field-Programmable Custom Computing Machines, 12th Annual IEEE Symposium on (FCCM’04), pages 135–144. IEEE Computer Society, 2004.
2) Z. K. Baker and V. K. Prasanna. Time and area efficient pattern matching on FPGAs. In Proceedings of the 2004 ACM/SIGDA 12th International Symposium on Field Programmable Gate Arrays, pages 223–232. ACM Press, 2004.
3) A. Baratloo, N. Singh, and T. Tsai. Transparent run-time defense against stack smashing attacks. In Proceedings of the USENIX Security Symposium, June 2000.
4) C. R. Clark and D. E. Schimmel. Efficient reconfigurable logic circuits for matching complex network intrusion detection patterns. In 13th International Conference on Field Programmable Logic and Applications, Sept. 2003.
5) S. A. Crosby and D. S. Wallach. Denial of service via algorithmic complexity attacks. In Proceedings of USENIX Annual Technical Conference, June 2003.
6) M. Gokhale, D. Dubois, A. Dubois, M. Boorman, S. Poole, and V. Hogsett. Granidt: Towards gigabit rate network intrusion detection technology. In Proceedings of the Reconfigurable Computing Is Going Mainstream, 12th International Conference on Field-Programmable Logic and Applications, pages 404–413. Springer-Verlag, 2002.
7) B. L. Hutchings, R. Franklin, and D. Carver. Assisting network intrusion detection with reconfigurable hardware. In Proceedings of the 10 th Annual IEEE Symposium on Field-Programmable Custom Computing Machines (FCCM’02), page 111. IEEE Computer Society, 2002.
8) K. Mai, T. Paaske, N. Jayasena, R. Ho, W. Dally, and M. Horowitz. Smart memories: A modular reconfigurable architecture. In Annual International Symposium on Computer Architecture, June 2000.
9) Xinidis, K., Anagnostakis, K.G., and Markatos, E.P., “Design and implementation of a high performance network intrusion prevention system“, Proceedings of the 20th International Information Security Conference (SEC 2005), Makuhari-Messe, Chiba, Japan, May 30 – June 1, 2005.
10) Sproull, T., and Lockwood, J., “Wide-area hardware-accelerated intrusion prevention systems (WHIPS)“, Proceedings of the International Working Conference on Active Networking (IWAN), Lawrence, Kansas, USA, October 27 – 29, 2004.
11) Song, H., and Lockwood, J.W., “Efficient packet classification for network intrusion detection using FPGA“, Proceedings of the International Symposium on Field-Programmable Gate Arrays (FPGA’05), Monterey, California, Feb 20-22, 2005.
12) S. Axelsson, “Intrusion Detection Systems: A Taxonomy and Survey,” Tech. report no. 99-15, Dept. of Comp. Eng., Chalmers Univ. of Technology, Sweden, Mar. 20, 2003.
13) Debar, H., Wespi, A.: Aggregation and correlation of intrusion detection alerts. In: 4th Workshop on Recent Advances in Intrusion Detection. Volume 2212 of Lecture Notes in Computer Science. Springer-Verlag (2001), Zurich Research Laboratory, 2001. pp 85-103 (2001)
14) Deswarte, Y., Blain, L., Fabre, J.C.: Intrusion tolerance in distributed computing systems. In: IEEE Symposium on Research in Security and Privacy, Oakland. 20-22 May-1991, pp.110–121 (1991)
15) Dutertre, B., Crettaz, V., Stavridou, V.: Intrusion-tolerant Enclaves. In: IEEE International Symposium on Security and Privacy. Oakland-CA, May, 2002.pp.216-224 (2002)
16) M. Jianliang, S. Haikun and B. Ling.: The Application on Intrusion Detection based on K- Means Cluster Algorithm. In: International Forum on Information Technology and Application, Chengdu, 15-17 may 2009.pp.150-152 (2009)
17) Chapple, M.J., Wright, T.E., Winding, R.M.: Flow Anomaly Detection in Firewalled Networks. In: Securecomm and Workshops, 2006, Baltimore, MD, Aug. 28 2006-Sept. 1 2006, pp.1–6 (2006)
18) Yu Guan, Ali A. Ghorbani and Nabil Belacel.: Y-means: a clustering method for Intrusion Detection. In: Canadian Conference on Electrical and Computer Engineering, Montréal, Québec, Canada, 4-7 May 2003. pp.1083-1086 (2003)
19) Zhou, Mingqiang., HuangHui, WangQian.: A Graph-based Clustering Algorithm for Anomaly Intrusion Detection. In: 7th International Conference on computer science and education (ICCSE), ,Melbourne, pp.1311-1314 (2012).
20) Chitrakar, R., and Huang Chuanhe.: Anomaly detection using Support Vector Machine Classification with K- Medoids clustering. In: 3rd Asian Himalayas International conference, Kathmandu, Nepal.23-25 November 2012.pp.1-5 (2012)
21) Li Xue-Yong, Gao Guo.: A New Intrusion Detection Method Based on Improved DBSCAN. In: WASE International conference on Information Engineering, Beidaihe, Hebei, 14-15 August 2010. pp.117-120 (2010)
22) Lei Li, De-Zhang, Fang-Cheng Shen.: A novel rule-based Intrusion Detection System using data mining. In: IEEE International conference on Computer science and Information Technology, Chengdu, 9-11 July 2010.pp169-172 (2010)
23) Zhengjie, Li., Yongzhong Li., Lei Xu.: Anomaly intrusion detection method based on K-means clustering algorithm with particle swarm optimization. In: ICM, Nanjing, Jiangsu, 24-25 September 2011.pp.157-161 (2011)
24) Kapil Wankhade., Sadia Patka., Ravindra Thool.: An Overview of Intrusion Detection Based on Data Mining Techniques. In: IEEE International Conference on Communication Systems and Network Technologies, Gwalior, 6-8 April 2013, pp.626-629 (2013)
25) Schaffrath, G., Sadre, R., Morariu, C.: An Overview of IP Flow-Based Intrusion Detection. In: IEEE Communications Surveys & Tutorials, 26th April 2010. pp. 343 – 356 (2010)
26) Jadidi, Z., Muthukkumarasamy, V. ; Sithirasenan, E. ; Sheikhan, M.: Flow-Based Anomaly Detection Using Neural Network Optimized with GSA Algorithm. In: IEEE 33rd International Conference on Distributed computing System, Philadelphia, 2013 .pp. 76 – 81 (2013)
27) Ravi Ranjan., G. Sahoo.: A new clustering approach for anomaly intrusion detection. In: International Journal of Data Mining & Knowledge Management Process (IJDKP) Vol.4, No.2, Mesra, Ranchi, March (2014)
28) Malek, S.F, Khorsandi, S.: A cooperative intrusion detection algorithm based on trusted voting for mobile ad hoc network. In. 2013 21st Iranian Conference on Electrical Engineering (ICEE), Mashhad, 14-16 May 2013.pp.1-8 (2013)
29) Applying an Efficient Searching Algorithm for Intrusion Detection on Ubicom Network Processor
30) Qutaiba Ibrahim and Sahar Lazim Computer Engineering Department, University of Mosul, Iraq,2011
31) “IP3000/IP2000 Family Software Development Kit Reference Manual”, UBICOM, Inc., 28 June 2005, Web Site: http//www.ubicom.com.
32) Artificial Intelligence: A Modern Approach, by Stuart Russell and Peter Norvig, Second edition.