- Bayesian Networks for Named Entity Prediction in Programming Community Question Answering Within this study, we propose a new approach for natural language processing using Bayesian networks to predict and analyze the context and how this approach can be applied to the Community Question Answering domain. We discuss how Bayesian networks can detect semantic relationships and dependencies between entities, and this is connected to different score-based approaches of structure-learning. We compared the Bayesian networks with different score metrics, such as the BIC, BDeu, K2 and Chow-Liu trees. Our proposed approach out-performs the baseline model at the precision metric. We also discuss the influence of penalty terms on the structure of Bayesian networks and how they can be used to analyze the relationships between entities. In addition, we examine the visualization of directed acyclic graphs to analyze semantic relationships. The article further identifies issues with detecting certain semantic classes that are separated in the structure of directed acyclic graphs. Finally, we evaluate potential improvements for the Bayesian network approach. 2 authors · Feb 26, 2023
- Attribute-Efficient PAC Learning of Low-Degree Polynomial Threshold Functions with Nasty Noise The concept class of low-degree polynomial threshold functions (PTFs) plays a fundamental role in machine learning. In this paper, we study PAC learning of K-sparse degree-d PTFs on R^n, where any such concept depends only on K out of n attributes of the input. Our main contribution is a new algorithm that runs in time ({nd}/{epsilon})^{O(d)} and under the Gaussian marginal distribution, PAC learns the class up to error rate epsilon with O(K^{4d}{epsilon^{2d}} cdot log^{5d} n) samples even when an eta leq O(epsilon^d) fraction of them are corrupted by the nasty noise of Bshouty et al. (2002), possibly the strongest corruption model. Prior to this work, attribute-efficient robust algorithms are established only for the special case of sparse homogeneous halfspaces. Our key ingredients are: 1) a structural result that translates the attribute sparsity to a sparsity pattern of the Chow vector under the basis of Hermite polynomials, and 2) a novel attribute-efficient robust Chow vector estimation algorithm which uses exclusively a restricted Frobenius norm to either certify a good approximation or to validate a sparsity-induced degree-2d polynomial as a filter to detect corrupted samples. 2 authors · Jun 1, 2023
- Fat Polygonal Partitions with Applications to Visualization and Embeddings Let T be a rooted and weighted tree, where the weight of any node is equal to the sum of the weights of its children. The popular Treemap algorithm visualizes such a tree as a hierarchical partition of a square into rectangles, where the area of the rectangle corresponding to any node in T is equal to the weight of that node. The aspect ratio of the rectangles in such a rectangular partition necessarily depends on the weights and can become arbitrarily high. We introduce a new hierarchical partition scheme, called a polygonal partition, which uses convex polygons rather than just rectangles. We present two methods for constructing polygonal partitions, both having guarantees on the worst-case aspect ratio of the constructed polygons; in particular, both methods guarantee a bound on the aspect ratio that is independent of the weights of the nodes. We also consider rectangular partitions with slack, where the areas of the rectangles may differ slightly from the weights of the corresponding nodes. We show that this makes it possible to obtain partitions with constant aspect ratio. This result generalizes to hyper-rectangular partitions in R^d. We use these partitions with slack for embedding ultrametrics into d-dimensional Euclidean space: we give a rm polylog(Delta)-approximation algorithm for embedding n-point ultrametrics into R^d with minimum distortion, where Delta denotes the spread of the metric, i.e., the ratio between the largest and the smallest distance between two points. The previously best-known approximation ratio for this problem was polynomial in n. This is the first algorithm for embedding a non-trivial family of weighted-graph metrics into a space of constant dimension that achieves polylogarithmic approximation ratio. 3 authors · Sep 9, 2010
- Networks bijective to permutations We study the set of networks, which consist of sources, sinks and neutral points, bijective to the permutations. The set of directed edges, which characterizes a network, is constructed from a polyomino or a Rothe diagram of a permutation through a Dyck tiling on a ribbon. We introduce a new combinatorial object similar to a tree-like tableau, which we call a forest. A forest is shown to give a permutation, and be bijective to a network corresponding to the inverse of the permutation. We show that the poset of networks is a finite graded lattice and admits an EL-labeling. By use of this EL-labeling, we show the lattice is supersolvable and compute the M\"obius function of an interval of the poset. 1 authors · Feb 8, 2024
2 The snake in the Brownian sphere The Brownian sphere is a random metric space, homeomorphic to the two-dimensional sphere, which arises as the universal scaling limit of many types of random planar maps. The direct construction of the Brownian sphere is via a continuous analogue of the Cori--Vauquelin--Schaeffer (CVS) bijection. The CVS bijection maps labeled trees to planar maps, and the continuous version maps Aldous' continuum random tree with Brownian labels (the Brownian snake) to the Brownian sphere. In this work, we describe the inverse of the continuous CVS bijection, by constructing the Brownian snake as a measurable function of the Brownian sphere. Special care is needed to work with the orientation of the Brownian sphere. 4 authors · Feb 18 2
- One Tree to Rule Them All: Poly-Logarithmic Universal Steiner Tree A spanning tree T of graph G is a rho-approximate universal Steiner tree (UST) for root vertex r if, for any subset of vertices S containing r, the cost of the minimal subgraph of T connecting S is within a rho factor of the minimum cost tree connecting S in G. Busch et al. (FOCS 2012) showed that every graph admits 2^{O(log n)}-approximate USTs by showing that USTs are equivalent to strong sparse partition hierarchies (up to poly-logs). Further, they posed poly-logarithmic USTs and strong sparse partition hierarchies as open questions. We settle these open questions by giving polynomial-time algorithms for computing both O(log ^ 7 n)-approximate USTs and poly-logarithmic strong sparse partition hierarchies. For graphs with constant doubling dimension or constant pathwidth we improve this to O(log n)-approximate USTs and O(1) strong sparse partition hierarchies. Our doubling dimension result is tight up to second order terms. We reduce the existence of these objects to the previously studied cluster aggregation problem and what we call dangling nets. 6 authors · Aug 2, 2023
- Effective and Efficient Federated Tree Learning on Hybrid Data Federated learning has emerged as a promising distributed learning paradigm that facilitates collaborative learning among multiple parties without transferring raw data. However, most existing federated learning studies focus on either horizontal or vertical data settings, where the data of different parties are assumed to be from the same feature or sample space. In practice, a common scenario is the hybrid data setting, where data from different parties may differ both in the features and samples. To address this, we propose HybridTree, a novel federated learning approach that enables federated tree learning on hybrid data. We observe the existence of consistent split rules in trees. With the help of these split rules, we theoretically show that the knowledge of parties can be incorporated into the lower layers of a tree. Based on our theoretical analysis, we propose a layer-level solution that does not need frequent communication traffic to train a tree. Our experiments demonstrate that HybridTree can achieve comparable accuracy to the centralized setting with low computational and communication overhead. HybridTree can achieve up to 8 times speedup compared with the other baselines. 8 authors · Oct 18, 2023
- Treemaps with Bounded Aspect Ratio Treemaps are a popular technique to visualize hierarchical data. The input is a weighted tree tree where the weight of each node is the sum of the weights of its children. A treemap for tree is a hierarchical partition of a rectangle into simply connected regions, usually rectangles. Each region represents a node of tree and its area is proportional to the weight of the corresponding node. An important quality criterion for treemaps is the aspect ratio of its regions. One cannot bound the aspect ratio if the regions are restricted to be rectangles. In contrast, polygonal partitions, that use convex polygons, have bounded aspect ratio. We are the first to obtain convex partitions with optimal aspect ratio O(depth(tree)). However, depth(tree) still depends on the input tree. Hence we introduce a new type of treemaps, namely orthoconvex treemaps, where regions representing leaves are rectangles, L-, and S-shapes, and regions representing internal nodes are orthoconvex polygons. We prove that any input tree, irrespective of the weights of the nodes and the depth of the tree, admits an orthoconvex treemap of constant aspect ratio. We also obtain several specialized results for single-level treemaps, that is, treemaps where the input tree has depth~1. 3 authors · Dec 8, 2010
- On Two Orderings of Lattice Paths The Markov numbers are positive integers appearing as solutions to the Diophantine equation x^2 + y^2 + z^2 = 3xyz. These numbers are very well-studied and have many combinatorial properties, as well as being the source of the long-standing unicity conjecture. In 2018, Canakc{\i} and Schiffler showed that the Markov number m_{a{b}} is the number of perfect matchings of a certain snake graph corresponding to the Christoffel path from (0,0) to (a,b). Based on this correspondence, Schiffler in 2023 introduced two orderings on lattice paths. For any path omega, associate a snake graph G(omega) and a continued fraction g(omega). The ordering <_M is given by the number of perfect matchings on G(omega), and the ordering <_L is given by the Lagrange number of g(omega). In this work, we settle two conjectures of Schiffler. First, we show that the path omega(a,b) = RRcdots R UU cdots U is the unique maximum over all lattice paths from (0,0) to (a,b) with respect to both orderings <_M and <_L. We then use this result to prove that sup L(omega) over all lattice paths is exactly 1+sqrt5. 2 authors · Oct 25, 2023
- Everybody Prune Now: Structured Pruning of LLMs with only Forward Passes Given the generational gap in available hardware between lay practitioners and the most endowed institutions, LLMs are becoming increasingly inaccessible as they grow in size. Whilst many approaches have been proposed to compress LLMs to make their resource consumption manageable, these methods themselves tend to be resource intensive, putting them out of the reach of the very user groups they target. In this work, we explore the problem of structured pruning of LLMs using only forward passes. We seek to empower practitioners to prune models so large that their available hardware has just enough memory to run inference. We develop Bonsai, a gradient-free, perturbative pruning method capable of delivering small, fast, and accurate pruned models. We observe that Bonsai outputs pruned models that (i) outperform those generated by more expensive gradient-based structured pruning methods, and (ii) are twice as fast (with comparable accuracy) as those generated by semi-structured pruning methods requiring comparable resources as Bonsai. We also leverage Bonsai to produce a new sub-2B model using a single A6000 that yields state-of-the-art performance on 4/6 tasks on the Huggingface Open LLM leaderboard. 6 authors · Feb 7, 2024